Abstract: We study the reliability maximization problem in 
WDM networks with random link failures. Reliability in these 
networks is defined as the probability that the logical network 
is connected, and it is determined by the underlying lightpath 
routing, network topologies and the link failure probability. By 
introducing the notion of lexicographical ordering for lightpath 
routings, we characterize precise optimization criteria for maxi- 
mum reliability in the low failure probability regime. Based on 
the optimization criteria, we develop lightpath routing algorithms 
that maximize the reliability, and logical topology augmentation 
algorithms for further improving reliability. We also study the 
reliability maximization problem in the high failure probability 
regime. 



I. Introduction 

Modern communication networks are constructed using a 
layered approach, with one or more electronic layers (e.g., 
IP, ATM, SONET) built on top of an optical fiber network. 
The survivability of such networks under fiber failures largely 
depends on how the logical electronic topology is embedded 
onto the physical fiber topology. In the context of WDM 
networks, this is known as lightpath routing. However, finding 
a reliable lightpath routing is rather challenging because it 
must take into account the sharing of physical fibers by 
logical links and its impact on the connectivity of the logical 
topology. Hence, the survivability of a layered network is 
a complex function of logical topology, physical topology, 
lightpath routing, and link failure probability. In this paper, we 
study reliable layered network design assuming that physical 
links fail at random with some probability, where multiple 
links may fail simultaneously. 

The probabilistic failure model represents a snapshot of a 
network where links fail and are repaired after a certain time 
as in many practical scenarios [1]. Hence, the link failure 
probability can be viewed as the average fraction of time that a 
link is in a failed state. This random failure model is somewhat 
general in that it can be used to model both networks with 
rare link failures as well as more frequent failures. It thus 
enables thorough understanding of network survivability in 
various failure regimes. For this reason, several works in the 
literature study survivable network design under the random 
failure model [1]-[4]. 

In the context of layered networks with random physical 
link failures, a natural survivability metric is the probability 
that, given a lightpath routing, the logical topology remains 
connected; we call this probability the cross-layer (network) 
reliability. The cross-layer reliability reflects the survivability 
performance achieved by the layered network. Hence, it is 
desirable to design a layered network that maximizes the 
reliability. Although the single-layer network design problem 
has been extensively studied [5]-[12], the layered 
network reliability problem remains largely unexplored. 
Existing work in the area [13]-[22] has mostly focused on 
finding a lightpath routing that survives a single physical 
link failure, rather than finding the one with maximum 
reliability. Our work in [32] was the first study to maximize 
the tolerance of such physical failures for a lightpath 
routing, and cross-layer reliability was introduced in [23] 
to generalize this notion. In particular, we extended the 
polynomial expression for single-layer network reliability 
to the layered setting, and developed approximation algorithms 
for reliability computation. We also demonstrated a 
positive correlation between the reliability and the Min Cross 
Layer Cut (MCLC; the precise definition of MCLC is 
presented in Section II) in the low failure probability 
regime, and experimented with MCLC as the objective in 
our lightpath routing algorithm to approximate reliability 
maximization. 

Our goal is to fully characterize the structures that 
contribute to the reliability in a layered network. This 
gives us the precise optimization criterion for maximizing 
the reliability. Although optimizing the exact criterion is 
infeasible in practice, the insight allows us to develop a 
new objective that better approximates reliability maxi- 
mization. 

Typically, real-world networks experience very low link failure 
probabilities, and are designed accordingly. For instance, 
the failure probability of a 1000-mile cable in the Bellcore 
network is estimated to be about 0.006 [24]. However, in 
recent years there has been an increased concern about the 
impact of natural disasters or physical attacks on network 
survivability. Natural disasters, such as earthquakes, hurricanes, 
or floods, can lead to a large number of (possibly 
localized) link failures that cannot be survived by networks 
designed to deal only with isolated failures [25], [26]. Worse 
yet, a physical attack on the network by weapons of mass 
destruction, such as an Electromagnetic Pulse (EMP), can lead 
to widespread failures throughout large geographical areas 
[26]-[28]. Such an attack can have a disastrous effect on 
telecommunication links that rely on electronic components, 
from fiber amplifiers to regenerators, switches and routers, 
for their operation. Moreover, such an attack is likely to 
disrupt the power grid [29], [30], which can in turn lead to 
significant additional (cascading) failures of communication 
links, as was recently observed during a blackout event in 
Italy [31]. Thus, while typically one may expect extremely low 
failure probabilities, and design networks accordingly, such 
designs may not be robust to widespread failures that may 
result from a natural disaster or attack. Furthermore, it may 
be worthwhile to strengthen networks of critical importance 
so that they can withstand such scenarios. 

Our primary focus in this work is on the low failure 
probability regime, as that is the regime that networks are 
typically designed for. However, to account for the increasing 
concerns with large scale failures, we also characterize net- 
work survivability in higher failure probability regimes. While 
such designs may not be applicable to most networks, they 
may prove valuable to the design of networks with stringent 
survivability requirements. 

One of the major challenges in the area of cross-layer 
survivability is the inherent complexity of the problems. 
For example, in [32], we proved that the MCLC, a 
critical component in layered network reliability, is NP-hard 
to compute and to approximate within an O(log n) 
factor. Therefore, the problem of maximizing cross-layer 
reliability is likely to be intractable. The common approach 
in existing lightpath routing algorithms involves finding 
the physical routes of all logical links jointly, typically 
by solving an ILP that captures the routing decision of 
all the logical links, which is often infeasible for large 
networks. In this paper, we consider a different approach 
that incrementally improves the layered network, one 
logical link at a time. Such an approach has two advantages 
over the existing algorithms: 

1) Scalability: Routing the logical links incrementally 
reduces the problem space significantly. As a result, 
it is more applicable to large networks. 

2) Solution Quality: The incremental approach allows 
us to use a more sophisticated objective function 
that better approximates the cross-layer reliability. 
As a result, the lightpath routings given by the 
new algorithm result in much higher reliability than 
existing algorithms. 

We also apply a similar idea to a different setting where 
the logical topology can be augmented to improve relia- 
bility. We develop an augmentation algorithm to find a 
good placement of a new logical link, and observe that 
reliability can be improved significantly, especially when 
the augmentation increases the MCLC. 

Our contributions can be summarized as follows: 

- We show that in general the optimal lightpath routing 
depends on the link failure probability. 

- We show that for given logical and physical topologies, 
if there exists a uniformly optimal lightpath routing, 
then any locally optimal lightpath routing is uniformly 
optimal. 

- We develop a novel "lexicographical ordering" for light- 
path routing and derive precise optimality conditions in 
both the low and high failure probability regimes. 



- We develop lightpath rerouting algorithms for maximiz- 
ing reliability in the low failure probability regime. 

- We develop a logical topology augmentation algorithm 
for improving the reliability of a given layered network. 

The rest of the paper is organized as follows: In Section 
II, we present the network model, and introduce the polynomial 
expression for the cross-layer reliability and important 
connectivity parameters related to reliability. In Section III, 
we study the properties of optimal lightpath routings in the 
low failure probability regime. In Section IV, we develop 
lightpath rerouting and logical topology augmentation algorithms 
for reliability maximization, and in Section V, we 
present extensive simulation results. In Section VI, we discuss 
the optimality conditions for maximum reliability in the high 
failure probability regime. 

II. Model and Background 

We consider a layered network $\mathcal{G}$ that consists of the logical 
topology $G_L = (V_L, E_L)$ built on top of the physical topology 
$G_P = (V_P, E_P)$ through a lightpath routing, where $V$ and $E$ 
are the sets of nodes and links respectively. In the context of 
WDM networks, a logical link is called a lightpath, and each 
lightpath is routed over the physical topology. This lightpath 
routing is denoted by $f = [f_{ij}^{st}, (i,j) \in E_P, (s,t) \in E_L]$, 
where $f_{ij}^{st}$ takes the value 1 if logical link $(s,t)$ is routed over 
physical link $(i,j)$, and 0 otherwise. 

Each physical link fails independently with probability 
$p$ (see footnote 1). This probabilistic failure model represents a snapshot of 
a network where links fail and are repaired according to 
some Markovian process. Hence, $p$ represents the steady-state 
probability that a physical link is in a failed state. This model 
has been adopted by several previous works [1]-[4]. 

If a physical link $(i,j)$ fails, all of the logical links $(s,t)$ 
carried over $(i,j)$ (i.e., $(s,t)$ such that $f_{ij}^{st} = 1$) also fail. A set 
S of physical links is called a cross-layer cut if the failure of 
the links in S causes the logical network to be disconnected. 
We also define the network state as the subset S of physical 
links that failed. Hence, if S is a cross-layer cut, the network 
state S represents a disconnected network state. Otherwise, it 
is a connected state. 
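The definition of a cross-layer cut can be checked mechanically. The following is a minimal sketch, not part of the original paper: the function name `is_cross_layer_cut`, the representation of a routing as a dictionary from logical links to sets of physical link labels, and the triangle example are all assumptions made for illustration.

```python
def is_cross_layer_cut(failed, logical_nodes, routing):
    """Return True if failing the physical links in `failed` disconnects the logical topology.

    `routing` maps each logical link (s, t) to the set of physical links it is routed over.
    """
    failed = set(failed)
    # A logical link survives only if none of the physical links it uses has failed.
    adjacency = {v: [] for v in logical_nodes}
    for (s, t), used in routing.items():
        if not (set(used) & failed):
            adjacency[s].append(t)
            adjacency[t].append(s)
    # Depth-first search over the surviving logical links.
    start = next(iter(logical_nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adjacency[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) < len(adjacency)


# Hypothetical example: a logical triangle whose lightpaths use disjoint pairs
# of physical links labelled 0..5.
nodes = ["a", "b", "c"]
routing = {("a", "b"): {0, 1}, ("b", "c"): {2, 3}, ("a", "c"): {4, 5}}
print(is_cross_layer_cut({0, 1}, nodes, routing))  # False: only one lightpath is down
print(is_cross_layer_cut({0, 2}, nodes, routing))  # True: node b is isolated
```

In this representation a network state is simply the set `failed`, so classifying a state as connected or disconnected is a single call.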

A. Failure Polynomial and Connectivity Parameters 

Assume that there are $m$ physical links, i.e., $|E_P| = m$. The 
probability associated with a network state $S$ with exactly $i$ 
physical link failures (i.e., $|S| = i$) is $p^i (1-p)^{m-i}$. Let $N_i$ 
be the number of cross-layer cuts $S$ with $|S| = i$. Then the 
probability that the network is disconnected is simply the sum 
of the probabilities over all cross-layer cuts, i.e., 

$$F(p) = \sum_{i=0}^{m} N_i \, p^i (1-p)^{m-i}. \qquad (1)$$

Therefore, the failure probability of a multi-layer network can 
be expressed as a polynomial in $p$. The function $F(p)$ will be 
called the cross-layer failure polynomial or simply the failure 
polynomial. The coefficients $N_i$ contain the information on 
the structure of a layered graph, determined by the underlying 
lightpath routing. Below we introduce some important coefficients 
related to connectivity. 

Footnote 1: Although we assume uniform link failure probability throughout the 
paper, our results can be readily extended to the case of non-uniform link 
failure probability by replacing each link with multiple links in series 
that fail with the same probability. See [23] for more details. 
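For very small networks, the coefficients of the failure polynomial can be obtained by brute-force enumeration of all network states. The sketch below is illustrative only: the helper names (`cut_vector`, `failure_polynomial`, `mclc_size`) and the toy routing (a logical triangle whose three lightpaths each use their own two physical links) are assumptions, and the enumeration is exponential in the number of physical links.

```python
from itertools import combinations


def cut_vector(m, is_cut):
    """N[i] = number of size-i sets S of physical links (labelled 0..m-1) with is_cut(S) true."""
    return [sum(1 for S in combinations(range(m), i) if is_cut(set(S)))
            for i in range(m + 1)]


def failure_polynomial(N, p):
    """Evaluate F(p) = sum_i N[i] * p^i * (1 - p)^(m - i), as in (1)."""
    m = len(N) - 1
    return sum(n * p**i * (1 - p)**(m - i) for i, n in enumerate(N))


def mclc_size(N):
    """Smallest i with N[i] > 0, i.e. the size of a smallest cross-layer cut."""
    return next(i for i, n in enumerate(N) if n > 0)


# Hypothetical routing: a logical triangle, each lightpath on its own two physical links.
# A triangle is disconnected exactly when at least two of its three links fail.
lightpaths = [{0, 1}, {2, 3}, {4, 5}]
is_cut = lambda S: sum(bool(S & path) for path in lightpaths) >= 2

N = cut_vector(6, is_cut)
print(N)                           # [0, 0, 12, 20, 15, 6, 1]
print(mclc_size(N))                # 2
print(failure_polynomial(N, 0.1))  # approximately 0.0946
```

The value at p = 0.1 agrees with one minus the reliability of three disjoint two-hop lightpaths, 1 - (3(1-p)^4 - 2(1-p)^6), which gives a quick consistency check on the enumeration.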

Each $N_i$ represents the number of cross-layer cuts of size $i$ 
in the network. Define a Min Cross Layer Cut (MCLC) as a 
smallest set of physical links needed to disconnect the logical 
network. Denote by $d$ the size of an MCLC. Then $d$ is the 
smallest $i$ such that $N_i > 0$, meaning that the logical network 
will not be disconnected by fewer than $d$ physical link failures. 
The MCLC is a generalization of the single-layer min-cut to the 
multi-layer setting [32]. It was shown in [23] that maximizing 




Fig. 1. Example showing that lightpath rerouting can improve the reliability 
(panel (b): after rerouting). The physical topology is shown in solid lines, the 
logical topology is the rectangle formed by the 4 corner nodes and 4 edges, and 
the lightpath routing is shown in dashed lines. 

Although the MCLC criterion is useful for finding a lightpath 
routing with better reliability, it is not sufficient for 
fully characterizing reliable lightpath routings. For example, 
consider the two lightpath routings in Fig. 1. The two lightpath 
routings have the same MCLC value of 2. However, for every 
value of p, the routing in Fig. 1(b) yields better reliability than 
the one in Fig. 1(a). This example shows that there are more 
precise conditions for optimal lightpath routings, beyond the 
MCLC maximization criterion. In Section III, we develop new 
optimization criteria that characterize in greater detail optimal 
lightpath routings in the low failure probability regime. 

Furthermore, the routing in Fig. 1(b) can be obtained by 

Fig. 2. Example showing that the reliability can be further improved via 
logical topology augmentation: in (a), dashed lines are added lightpaths; 
panel (b) shows the lightpath routing. 



In addition to the lightpath rerouting approach, the new 
optimization criteria can also be used to further enhance the 
reliability in a different manner. In particular, we consider 
logical topology augmentation. For instance, suppose that two 
(diagonal) logical links are added to the logical topology in 
the example of Fig. 1 (see Fig. 2(a)). Fig. 2(b) is an example 
of routing the two new lightpaths. The new network has far 
better reliability than the old one in the low failure probability 
regime since the MCLC value has been raised from 2 to 3. 
This example shows that augmenting the logical topology can 
significantly improve the reliability. In Section IV-B, using the 
new optimization criteria, we study how to choose the new 
logical link that achieves maximum reliability improvement. 

III. Properties of Optimal Lightpath Routings 

We first study the properties of optimal lightpath routings. 
These properties will give insight on how routings should be 
designed for better reliability. Since the failure probability 
p is typically small in many practical scenarios, we mainly 
focus on the low failure probability regime. The properties of 
optimal lightpath routings for large p will be briefly discussed 
in Section VI. 

A. Uniformly and Locally Optimal Routings 

We start with a discussion of routings that are most reliable 
for all failure probabilities. The observations in this section 
will motivate a local (in p) optimization approach to the design 
of lightpath routing, which is relatively easy compared with 
an optimization over all the values of p. We begin with the 
following definition: 

Definition 1: For given logical and physical topologies, a 
lightpath routing is said to be uniformly optimal if its reliability 
is greater than or equal to that of any other lightpath routing 
for every value of p. 

Therefore, a uniformly optimal lightpath routing yields the 
best reliability for all $p \in [0, 1]$. Based on the failure polynomial 
of a lightpath routing, one can immediately develop a 
sufficient condition for a uniformly optimal lightpath routing: 

Observation 1: Given a lightpath routing $R$, let $N_i^R$ be the 
number of cross-layer cuts with size $i$. Then $R$ is a uniformly 
optimal lightpath routing if, for any other lightpath routing $R'$, 
$N_i^R \le N_i^{R'}$ for all $i \in \{0, \ldots, m\}$, where $m$ is the number 
of physical links. 

While it is desirable to design a uniformly optimal routing, 
such a routing does not always exist. Intuitively, for small 
p, only a small number of links are likely to fail simulta- 
neously, and hence for better reliability it is important to 
remain connected after a small number of failures. In contrast, 
for large p, it is likely that a large number of links fail 
simultaneously, and thus it is important to withstand a large 
number of failures. These two objectives conflict because the 
former prefers disjoint lightpath routing whereas the latter 
prefers shortest lightpath routing. 

For example, Fig. 3 shows two different lightpath routings. 
In Fig. 3(a), the logical links are routed over physically disjoint 
paths, and its reliability is given by $3(1-p)^4 - 2(1-p)^6$. 

Fig. 3. Example showing that optimal routings depend on the value of p: 
(a) optimal routing in the low regime; (b) optimal routing in the high regime. 
The physical topology is shown in solid lines, the logical topology is the 
triangle formed by the 3 corner nodes and 3 edges, and the lightpath routing 
is shown in dashed lines. 

In contrast, in Fig. 3(b), every pair of logical links shares a 
physical link, and its reliability is $(1-p)^3$. While disjoint 
path routing is considered to be more reliable, it is easy to see 
that in this example the disjoint routing has better reliability 
only for small values of p whereas for large p (e.g., p > 0.7) 
the non-disjoint routing is more reliable. 
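The comparison between the two routings of Fig. 3 can be made concrete by locating the value of p at which their reliabilities cross. The sketch below uses only the two reliability expressions quoted in the text; the function names and the bisection bracket [0.01, 0.99] are choices made here for illustration.

```python
def reliability_disjoint(p):
    """Reliability of the disjoint routing of Fig. 3(a), as given in the text."""
    return 3 * (1 - p) ** 4 - 2 * (1 - p) ** 6


def reliability_shared(p):
    """Reliability of the non-disjoint routing of Fig. 3(b), as given in the text."""
    return (1 - p) ** 3


def crossover(lo=0.01, hi=0.99, tol=1e-12):
    """Bisection on the gap between the two reliabilities (positive at lo, negative at hi)."""
    gap = lambda p: reliability_disjoint(p) - reliability_shared(p)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


p_star = crossover()
print(round(p_star, 4))  # 0.634
```

Substituting x = 1 - p, the crossing solves 2x^2 + 2x - 1 = 0, so p* = (3 - sqrt(3))/2, roughly 0.634. This is consistent with the remark that the non-disjoint routing is more reliable for p > 0.7.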

Since uniformly optimal lightpath routings are not always 
attainable, we are motivated to focus on locally optimal rout- 
ings, where the probability regime of optimality is restricted 
to a subrange within [0, 1]. A locally optimal lightpath routing 
is defined as follows: 

Definition 2: For given logical and physical topologies, a 
lightpath routing is said to be locally optimal if there exist 
$0 \le a < b \le 1$ such that its reliability is greater than or 
equal to that of any other lightpath routing for every value of 
$p \in [a, b]$. In addition, the interval $[a, b]$ is called the optimality 
regime for the lightpath routing. 

Note that a uniformly optimal lightpath routing is also 
locally optimal with optimality regime [0, 1]. Theorem 1 below 
is a crucial result for this study; namely, it reveals a connection 
between local optimality and uniform optimality. 

Theorem 1: Consider a pair of logical and physical topolo- 
gies (Gl, Gp) for which there exists a uniformly optimal rout- 
ing. Then, any locally optimal lightpath routing for (Gl,Gp) 
is also uniformly optimal. 

Proof: Denote by $F^*(p)$ the failure polynomial of a 
uniformly optimal lightpath routing. By definition, $F^*(p)$ is 
no greater than any other failure polynomial for $p \in [0, 1]$. 
Consider a locally optimal lightpath routing $L$ with optimality 
regime $[p_1, p_2]$, and let $F^L(p)$ be its failure polynomial. 

The polynomial equation $F^L(p) - F^*(p) = 0$ has degree at 
most $m$ and thus has at most $m$ roots unless the polynomial 
$F^L(p) - F^*(p)$ is identically zero. However, by the definitions 
of local optimality and uniform optimality, both routings are 
optimal on $[p_1, p_2]$, so $F^L(p) = F^*(p)$ there and the equation has 
an infinite number of solutions over the interval $[p_1, p_2]$. 
Consequently, $F^L(p)$ is identical to $F^*(p)$, which implies that 
lightpath routing $L$ is also uniformly optimal. ■ 

Motivated by this result, we study locally optimal light- 
path routings. In particular, we develop the conditions for a 
lightpath routing to be optimal for the low failure probability 
regime (small p). 

B. Low Failure Probability Regime 

It is easy to see that in the failure polynomial, the terms 
corresponding to small cross-layer cuts dominate when p is 
small. Hence, for reliability maximization in the low failure 
probability regime, it is desirable to minimize the number 
of small cross-layer cuts. We use this intuition to derive the 
properties of optimal routings for small p. We begin with the 
following definition: 

Definition 3: Consider two lightpath routings 1 and 2. Routing 
1 is said to be more reliable than routing 2 in the low 
failure probability regime if there exists a positive number $p_0$ 
such that the reliability of routing 1 is higher than that of 
routing 2 for $0 < p < p_0$. A lightpath routing is said to be 
locally optimal in the low failure probability regime if it is 
more (or equally) reliable than any other routing in the low 
failure probability regime. 

In the following, we characterize the impact of small cuts on 
the reliability. Let $d_j$ be the size of the MCLC under routing 
$j$ ($= 1, 2$). Let $N_i$ and $M_i$ be the numbers of cross-layer cuts 
of size $i$ under routings 1 and 2 respectively. We call the vector 
$N = [N_i, \forall i]$ the cut vector. The following is an example of 
cut vectors $N$ and $M$ with $d_1 = 4$ and $d_2 = 3$: 

i      1    2    3    4    5   ...   m 
N_i    0    0    0   20   26   ...   1 
M_i    0    0    9   19   30   ...   1 

Using cut vectors of lightpath routings, we define lexicograph- 
ical ordering as follows: 

Definition 4: Routing 1 is lexicographically smaller than 
routing 2 if $N_d < M_d$, where $d$ is the smallest $i$ at which 
$N_i$ and $M_i$ differ. 

Note that a lightpath routing with a larger MCLC size is 
lexicographically smaller by Definition 4. In the above example, 
we have $d = 3$ and $N_d < M_d$; hence routing 1 
is lexicographically smaller. Therefore, if a lightpath routing 
is lexicographically smaller than another, it has fewer small 
cross-layer cuts and thus yields better reliability for small p. 
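The lexicographical comparison of Definition 4 is straightforward to express in code. The sketch below is illustrative: the function names are made up here, and the two lists hold only the leading entries of the example cut vectors, with an assumed entry N_0 = M_0 = 0 prepended so that the list index equals the cut size.

```python
def first_difference(N, M):
    """Smallest index i at which the cut vectors differ, or None if they are identical."""
    return next((i for i, (a, b) in enumerate(zip(N, M)) if a != b), None)


def lex_smaller(N, M):
    """True if the routing with cut vector N is lexicographically smaller (Definition 4)."""
    d = first_difference(N, M)
    return d is not None and N[d] < M[d]


# Leading entries of the example cut vectors, indexed by cut size i = 0..5.
N = [0, 0, 0, 0, 20, 26]
M = [0, 0, 0, 9, 19, 30]

print(first_difference(N, M))  # 3
print(lex_smaller(N, M))       # True
print(N < M)                   # True: Python compares lists lexicographically as well
```

Because Python's built-in list ordering is already lexicographic, cut vectors stored as lists can be compared, sorted, or minimized directly with `<` and `min`.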

Theorem 2: Given two lightpath routings 1 and 2 with cut 
vectors $[N_i \mid i = 0, \ldots, m]$ and $[M_i \mid i = 0, \ldots, m]$ respectively, 
where $m$ is the number of physical links, if routing 1 is 
lexicographically smaller than routing 2, then routing 1 is more 
reliable than routing 2 in the low failure probability regime. In 
particular, let $d = \min_{0 \le i \le m} \{ i : M_i \ne N_i \}$ be the index where the 
elements in the cut vectors first differ. Then, lightpath routing 
1 is more reliable than routing 2 for 

$$p < p_0 = \frac{(d+1)(M_d - N_d)}{2m\binom{m}{d}}.$$

Proof: This is implied by Theorem 3, which will be 
discussed in Section III-C. ■ 

Clearly, Theorem 2 leads to a local optimality condition; 
that is, if a lightpath routing minimizes the cut vector 
lexicographically, then it is locally optimal in the low failure 
probability regime. An interesting case is when routing 1 has 
larger MCLC than routing 2 (as in the above example). In this 
case, routing 1 is lexicographically smaller than routing 2, and 
Theorem 2 implies the following corollary. 

Corollary 1: If $d_1 > d_2$, then routing 1 is more reliable 
than routing 2 in the low failure probability regime. 

Consequently, a lightpath routing with the maximum size 
MCLC yields the best reliability for small p. We note that the 
same result was shown in [23]. Similarly, routing 1 is also 
lexicographically smaller than routing 2 when they have the 
same size of MCLC but routing 1 has fewer MCLCs. This 
leads to the following result: 

Corollary 2: If $d_1 = d_2$ and $N_{d_1} < M_{d_2}$, then routing 1 is 
more reliable than routing 2 in the low failure probability regime. 

The expression for $p_0$ given in Theorem 2 also provides 
some insight into how the difference of the cut vectors affects 
the guaranteed regime. For example, if $d$ is small and $M_d - N_d$ 
is large, the guaranteed regime is larger. In other words, if 
one lightpath routing has fewer small cross-layer cuts than the 
other, it will achieve higher reliability for a larger range of p 
in the low failure probability regime. 
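To get a feel for the size of the guaranteed regime, the bound of Theorem 2 can be evaluated numerically. The sketch below assumes the bound reads p_0 = (d+1)(M_d - N_d) / (2m C(m, d)); the function name `theorem2_bound` and the value m = 10 are hypothetical, since the example cut vectors in the text do not specify m.

```python
from math import comb


def theorem2_bound(d, gap, m):
    """p0 = (d + 1) * gap / (2 * m * C(m, d)), where gap = M_d - N_d > 0."""
    return (d + 1) * gap / (2 * m * comb(m, d))


# Example from the text: d = 3 and M_d - N_d = 9; m = 10 physical links is an assumed value.
print(theorem2_bound(3, 9, 10))  # 0.015
# A smaller d with the same gap widens the guaranteed regime.
print(theorem2_bound(2, 9, 10))  # 0.03
```

Under these assumed numbers the guarantee covers only p below 1.5%, which illustrates why the bound is best read as a sufficient condition and not as the actual crossover point.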

Therefore, for reliability maximization in the low failure 
probability regime, it is desirable to maximize the size of the 
MCLC while minimizing the number of such MCLCs. This 
condition will be used to develop the algorithms in Section IV. 

C. Extension of Optimal Probability Regimes 

The expression in Theorem 2 only considers the first element 
at which the two cut vectors differ. As a result, 
the guaranteed regime is rather conservative. For instance, 
the expression fails to capture the uniform optimality of a 
lightpath routing that satisfies the condition in Observation 1. In 
this section, we will develop a more general expression for the 
regime bound that includes other elements in the cut vectors. 

Consider two lightpath routings 1 and 2. Let $F_j(p)$ be the 
failure polynomial of routing $j$ ($= 1, 2$), and let the $N_i$ and $M_i$ 
be the coefficients in $F_1(p)$ and $F_2(p)$ respectively. Define the 
following vector of partial sums: 

$$\vec{N} = \left[ \sum_{i=0}^{k} N_i \;\middle|\; k = 0, \ldots, m \right].$$

The vector $\vec{M}$ is defined similarly. Note that the $i$-th element 
$\vec{N}_i$ of vector $\vec{N}$ is the total number of cross-layer cuts of size 
at most $i$. We first extend the definition of lexicographical 
ordering as follows: 

Definition 5: Lightpath routing 1 is said to be $k$-lexicographically 
smaller than lightpath routing 2 if 

$$k = \max \left\{ j : \vec{N}_i \le \vec{M}_i, \ \forall i < d + j \right\} \quad \text{and} \quad k \ge 1,$$

where $d$ is the position of the first element where the two cut 
vectors differ. 

Therefore, a lightpath routing is lexicographically smaller 
(in the original sense) if and only if it is $k$-lexicographically 
smaller for some $k \ge 1$. The $k$-lexicographical ordering thus 
compares two lightpath routings based on structures beyond 
the smallest cuts, making it possible to establish a larger 
optimality regime. Roughly speaking, the value of $k$ reflects 
the degree of dominance of a lightpath routing in the low failure 
probability regime: a $k$-lexicographically smaller lightpath routing 
has fewer "small" cuts, where the definition of 
"small" is broader if $k$ is larger. 

It is obvious that when $p \le 0.5$, the failure probability of 
a cross-layer cut is a non-increasing function of the cut size, 
because $p^i (1-p)^{m-i} \ge p^{i+1} (1-p)^{m-i-1}$ for $p \le 0.5$. 
Suppose that routing 1 has a smaller total number of cuts of 
size up to $i$ than routing 2, i.e., $\vec{N}_i < \vec{M}_i$. To compare cross-layer 
cuts of size at most $i+1$, suppose further that the relative 
increment $N_{i+1} - M_{i+1}$ in the number of larger cuts does not 
exceed the surplus $\vec{M}_i - \vec{N}_i$ from smaller cuts, i.e., $\vec{N}_{i+1} \le 
\vec{M}_{i+1}$. Then, with respect to cut size at most $i+1$, routing 1 
will have smaller failure probability than routing 2, provided 
that the same was true for cut size up to $i$. This observation 
leads to the following theorem on the relationship between 
lexicographical ordering and probability regime. 
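The value of k in Definition 5 can be computed with a single pass over the partial sums. The sketch below assumes the reading of Definition 5 in which k is the largest j such that every partial sum of N up to index d + j - 1 is at most the corresponding partial sum of M; the function name `k_lex` and both pairs of cut vectors are hypothetical.

```python
from itertools import accumulate


def k_lex(N, M):
    """Return the k of Definition 5, or 0 if N is not lexicographically smaller than M."""
    d = next((i for i, (a, b) in enumerate(zip(N, M)) if a != b), None)
    if d is None or N[d] > M[d]:
        return 0
    k = 0
    # Count how far beyond position d the partial sums of N stay at or below those of M.
    for pn, pm in list(zip(accumulate(N), accumulate(M)))[d:]:
        if pn > pm:
            break
        k += 1
    return k


# Hypothetical cut vectors for m = 6 physical links (index = cut size).
N = [0, 0, 12, 20, 15, 6, 1]
M = [0, 3, 12, 19, 15, 6, 1]
print(k_lex(N, M))  # 6, i.e. d + k - 1 = m with d = 1
print(k_lex(M, N))  # 0
print(k_lex([0, 0, 2, 9, 4, 1], [0, 0, 3, 5, 4, 1]))  # 1
```

In the last call the first differing entry favors the first vector, but its partial sum overtakes the other one at the next index, so the dominance extends over a single cut size only.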

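Theorem 3: Given two vectors $N = [N_i \mid i = 0, \ldots, m]$ and 
$M = [M_i \mid i = 0, \ldots, m]$, let $F_1(p) = \sum_{i=0}^{m} N_i p^i (1-p)^{m-i}$ 
and $F_2(p) = \sum_{i=0}^{m} M_i p^i (1-p)^{m-i}$. For any $j$, let 

$$\Delta_j = \sum_{i=0}^{j} (M_i - N_i) \quad \text{and} \quad \gamma_j = \max_{j+1 \le i \le m} \left\{ \frac{N_i - M_i}{\binom{m}{i}},\, 0 \right\}.$$

If the vector $N$ is $k$-lexicographically smaller than $M$, then 

$$F_1(p) \le F_2(p) \quad \text{for} \quad p \le p_0^k = \min \left\{ 0.5, \max_{d \le j \le d+k-1} B_j \right\},$$

where $d = \min \{ i : N_i < M_i \}$ and 

$$B_j = \begin{cases} 0.5, & \text{if } j = m \\ \dfrac{1}{1 + \gamma_j \binom{m}{j+1} / \Delta_j}, & \text{otherwise.} \end{cases}$$

Proof: See Appendix A. 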


Therefore, the probability regime bound $p_0^k$ in Theorem 3 is 
a non-decreasing function of $k$, which means that a lightpath 
routing with a smaller number of cuts over a larger size range 
will be more reliable over a larger probability regime. This 
is consistent with the conclusion in Section III-B that the 
lightpath routing design should minimize the lexicographical 
ordering of the cut vector. 

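Theorem 2 follows from Theorem 3. For a lexicographically 
smaller lightpath routing, the term $B_d$ is given by: 

$$\begin{aligned}
B_d = \frac{1}{1 + \gamma_d \binom{m}{d+1} / \Delta_d}
&\ge \frac{1}{1 + \binom{m}{d+1} / (M_d - N_d)} \\
&\ge \frac{(d+1)(M_d - N_d)}{m (M_d - N_d) + (d+1)\binom{m}{d+1}} \\
&\ge \frac{(d+1)(M_d - N_d)}{m \binom{m}{d} + (m-d)\binom{m}{d}} \\
&\ge \frac{(d+1)(M_d - N_d)}{2m\binom{m}{d}},
\end{aligned}$$

where the first inequality is due to $\gamma_d \le 1$. 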

An interesting special case is when $d + k - 1 = m$, that 
is, $\vec{M}_j \ge \vec{N}_j$ for all $j = 0, \ldots, m$, where $\vec{N}_j$ and $\vec{M}_j$ denote 
the partial sums of the two cut vectors up to size $j$. In that case, the term 
$B_{d+k-1} = B_m = 0.5$, implying that the optimality regime is 
$[0, 0.5]$. We summarize this as the following corollary: 

Corollary 3: If $\vec{N}_j \le \vec{M}_j$ for all $j = 0, \ldots, m$, then 
lightpath routing 1 is at least as reliable as lightpath routing 
2 for $p \le 0.5$, i.e., $F_1(p) \le F_2(p)$ for $p \le 0.5$. 
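Corollary 3 can be checked numerically on a small example. The sketch below is illustrative only: the two cut vectors are hypothetical and correspond to a logical triangle over six physical links, routed once on disjoint two-hop paths (N) and once over three links that are each shared by two lightpaths (M); the function names are likewise made up.

```python
from itertools import accumulate


def partial_sums_dominated(N, M):
    """True if every partial sum of N is at most the corresponding partial sum of M."""
    return all(a <= b for a, b in zip(accumulate(N), accumulate(M)))


def failure_polynomial(N, p):
    """Evaluate sum_i N[i] * p^i * (1 - p)^(m - i) for a cut vector N of length m + 1."""
    m = len(N) - 1
    return sum(n * p**i * (1 - p)**(m - i) for i, n in enumerate(N))


N = [0, 0, 12, 20, 15, 6, 1]
M = [0, 3, 12, 19, 15, 6, 1]
grid = [i / 100 for i in range(51)]   # p = 0.00, 0.01, ..., 0.50
print(partial_sums_dominated(N, M))   # True
print(all(failure_polynomial(N, p) <= failure_polynomial(M, p) for p in grid))  # True
```

For these vectors the ordering reverses at larger p (for instance at p = 0.7), which is consistent with the corollary guaranteeing the comparison only up to p = 0.5.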

Note that the condition in Corollary 3 requires every partial 
sum in the vector M to be at least the corresponding partial 
sum in the vector N, which is a stronger condition than 
the lexicographic comparison in Theorem 2. This stronger 
condition allows the better optimality regime to be established 
in Corollary 3. 

IV. Maximizing Reliability by Improving 
Lightpath Routing and Logical Connectivity 

In this section, we explore ways to improve the reliability 
of a layered network. Typically, the physical topology is static 
and difficult to change. Therefore, the reliability of a layered 
network can be improved by one of two ways: (i) improving 
the lightpath routing, or (ii) improving the logical topology. 

We have shown in Section III that when physical link 
failures are rare, the lightpath routing that minimizes the 
lexicographical ordering will maximize the reliability. This 
new observation gives us an exact optimization criterion for 
designing reliable layered networks. 

As discussed in Section I, the traditional approach of 
jointly routing all logical links is often too complex, which 
makes it infeasible for larger networks. This motivates the 
incremental approach introduced in this section, where the 
layered network is improved one logical link at a time. This 
significantly reduces the problem space and allows us to 
use a more sophisticated objective function based on the 
optimization criterion we studied in Section III. 

Within this context, we study two optimization problems 
that are fundamental to improving the lightpath routing 
and logical connectivity: 

1) Lightpath Rerouting: Given the physical and logical 
topologies and a lightpath routing, find a logical link to 
reroute, such that the resulting reliability is maximized. 

2) Logical Topology Augmentation: Given the physical and 
logical topologies and a lightpath routing, find a pair 
of logical nodes, as well as a physical path between the 
nodes, such that the addition of the corresponding logical 
link will provide maximum reliability improvement. 

The above two problems are basic building blocks for 
designing reliable layered networks. For example, given an 
existing layered network, we can iteratively reroute existing 
lightpaths in the network until no further improvement is 
possible (see, e.g., Fig. 4). Hence, given the physical and 
logical topologies, the iterative rerouting algorithm can be 
described as follows: 

1) Generate an arbitrary initial lightpath routing. 

2) Reroute a logical link using the ILP or the approximation 
algorithm introduced in Section IV-A. 

3) Repeat Step 2 until no further improvement can be 
made by rerouting a single lightpath. 
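The three steps above can be sketched on a toy instance as follows. This is not the paper's algorithm: Step 2 is replaced here by an exhaustive search over all simple physical paths with brute-force cut enumeration, standing in for the ILP or approximation algorithm, and all names (`simple_paths`, `links_of`, `cut_vector`, `iterative_rerouting`) and the ring example are assumptions for illustration. Improvement is measured by the lexicographical ordering of the cut vector.

```python
from itertools import combinations


def simple_paths(adj, s, t, path=None):
    """Yield every simple s-t path in the physical graph as a list of nodes."""
    path = [s] if path is None else path
    if s == t:
        yield path
        return
    for v in adj[s]:
        if v not in path:
            yield from simple_paths(adj, v, t, path + [v])


def links_of(path):
    """Physical links (unordered node pairs) traversed by a node path."""
    return frozenset(frozenset(e) for e in zip(path, path[1:]))


def cut_vector(phys_links, nodes, routing):
    """N[i] = number of size-i physical link sets whose failure disconnects the logical topology."""
    N = []
    for i in range(len(phys_links) + 1):
        count = 0
        for failed in combinations(phys_links, i):
            alive = [l for l, used in routing.items() if not (used & set(failed))]
            comp, grew = {nodes[0]}, True
            while grew:  # grow the logical component containing nodes[0]
                grew = False
                for s, t in alive:
                    if (s in comp) != (t in comp):
                        comp |= {s, t}
                        grew = True
            count += len(comp) < len(nodes)
        N.append(count)
    return N


def iterative_rerouting(adj, nodes, routing):
    """Steps 2-3: apply the best single-lightpath reroute until the cut vector stops improving."""
    phys_links = sorted({frozenset((u, v)) for u in adj for v in adj[u]}, key=sorted)
    best = cut_vector(phys_links, nodes, routing)
    while True:
        trials = [{**routing, link: links_of(path)}
                  for link in routing for path in simple_paths(adj, *link)]
        vec, trial = min(((cut_vector(phys_links, nodes, t), t) for t in trials),
                         key=lambda c: c[0])
        if vec >= best:          # no single reroute is lexicographically smaller
            return routing, best
        routing, best = trial, vec


# Hypothetical instance: physical ring a-b-c-d-a, logical triangle on a, b, c.
adj = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
nodes = ["a", "b", "c"]
initial = {("a", "b"): links_of(["a", "b"]),
           ("b", "c"): links_of(["b", "c"]),
           ("a", "c"): links_of(["a", "b", "c"])}   # Step 1: an arbitrary initial routing
final, vec = iterative_rerouting(adj, nodes, initial)
print(vec)  # [0, 0, 5, 4, 1]
```

In this instance a single reroute of the lightpath (a, c) onto the path a-d-c removes both single-link cuts. Such a greedy loop can stop at a routing that no single reroute improves even though a better routing exists, so the result may depend on the initial routing chosen in Step 1.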

Similarly, if it is feasible to add new logical links, we 
can iteratively augment the logical topology to further 
improve the reliability; studying the Logical Topology 
Augmentation problem allows us to select such new logical 
links effectively. These iterative rerouting and augmentation 
algorithms will be used for performance evaluation in 
Section V. 

In this section, we present algorithms for the rerouting and 
augmentation problems. In the next section, we will evaluate 
the effectiveness of rerouting and augmentation on improving 
cross-layer reliability. 




Fig. 4. Improving reliability via lightpath rerouting. Panels (c) and (d) are 
labeled d = 2, N_d = 5 and d = 2, N_d = 3, respectively. The physical topology 
is in solid lines, and the lightpath routing of the logical topology is in dashed 
lines. The MCLC value and the number of MCLCs in the lightpath routings 
are denoted by d and N_d. 

A. Lightpath Rerouting 

Given a layered network and its lightpath routing, the 
objective of the Lightpath Rerouting problem is to find the best 
way to reroute a lightpath, so that the reliability improvement 
is maximized. Recall that with low link failure probability, the 
reliability of a network is maximized when the lexicographical 
ordering of its cut vector is minimized. Therefore, the most 
effective reroute should maximize the MCLC of the resulting 
lightpath routing, and also minimize the number of MCLCs. 
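To make the criterion concrete, the sketch below computes the first non-zero entry of the cut vector, that is, the MCLC value d and the number of MCLCs N_d, by brute force, and ranks two routings by "larger d first, then fewer MCLCs". The data layout (a dict from logical link to the set of fibers it uses) and all function names are assumptions made for illustration; the enumeration is exponential and only meant for tiny examples.

```python
from itertools import combinations


def is_connected(nodes, edges):
    """Depth-first search connectivity test on an undirected graph."""
    nodes = list(nodes)
    adj = {v: [] for v in nodes}
    for s, t in edges:
        adj[s].append(t)
        adj[t].append(s)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(nodes)


def mclc_and_count(nodes, routing, fibers):
    """Return (d, N_d): the MCLC value and the number of MCLCs.

    routing maps each logical link (s, t) to the set of fibers it uses.
    """
    for size in range(1, len(fibers) + 1):
        n_cuts = 0
        for subset in combinations(fibers, size):
            failed = set(subset)
            alive = [l for l, path in routing.items() if not (path & failed)]
            if not is_connected(nodes, alive):
                n_cuts += 1
        if n_cuts:
            return size, n_cuts
    raise ValueError("no set of fibers disconnects the logical topology")


def better(routing_a, routing_b, nodes, fibers):
    """True if routing_a has a larger MCLC, or the same MCLC and fewer MCLCs."""
    da, na = mclc_and_count(nodes, routing_a, fibers)
    db, nb = mclc_and_count(nodes, routing_b, fibers)
    return (-da, na) < (-db, nb)   # tuples compare lexicographically
```

For a logical triangle whose three links each ride a distinct fiber, this gives (d, N_d) = (2, 3); routing two of the links over the same fiber drops it to (1, 1).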

In the following sections, we first analyze the effect of 
rerouting a lightpath and characterize conditions where such 
a rerouting is beneficial. This provides the groundwork for 
our rerouting algorithms. Based on these observations, we 
develop an ILP to find the optimal lightpath to reroute. 
Next, we propose an approximation algorithm that computes 
a near-optimal solution in much shorter time. This gives us a 
scalable algorithm that can be used for designing large layered 
networks. 

1) Effects of Rerouting a Lightpath: Let d be the size of the
MCLC under the initial routing. When the physical route of a 
logical link changes, some cross-layer cuts will be converted 
into non-cuts, and some non-cuts will be converted into cross- 
layer cuts. In the low failure probability regime, the reliability 
will be improved by the rerouting if the following is true: 

1) The conversion of cross-layer cuts with size d to non- 
cuts outnumbers the conversion in the opposite direction. 

2) The MCLC value does not decrease. 

Therefore, we can formulate the lightpath rerouting as an 
optimization problem to maximize the reduction in the number 
of MCLCs, subject to the constraint that no non-cut of size
smaller than d is converted to a cross-layer cut. The exact
conditions for the conversion between cuts and non-cuts are 
described as follows, which will be used as the basis of the 
ILP formulation as well as the approximation algorithm. 

Given the physical topology $G_P = (V_P, E_P)$ and the logical
topology $G_L = (V_L, E_L)$, we model a lightpath routing as
a set of binary constants $\{f_{ij}^{st}\}$, where $f_{ij}^{st} = 1$ if and
only if logical link $(s,t)$ uses physical link $(i,j)$ in the
lightpath routing. For a given set of physical links $S$, we
define the logical residual graph for $S$, denoted as $G_L^S$, to be

$$G_L^S = \Big(V_L,\ \Big\{(s,t) \in E_L : \sum_{(i,j) \in S} f_{ij}^{st} = 0\Big\}\Big).$$

In other words, the residual graph consists of logical links
that use none of the physical links in $S$. By definition, the
set $S$ is a cross-layer cut if and only if its logical residual
graph is disconnected. Given a cross-layer cut $S$, it is called
a k-way cross-layer cut if its logical residual graph has k
connected components. In addition, given a cross-layer non-cut
$T$ for a lightpath routing, we call a logical link $(s,t)$
critical to $T$ if $(s,t)$ is a cut edge of the residual graph
$G_L^T$, that is, it is an edge in $G_L^T$ whose removal
will disconnect the residual graph.
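These definitions translate directly into code. The following sketch, under the assumption that a routing is stored as a dict from logical link to the set of physical links it uses, computes the residual graph, the number k of components (so k = 1 means a non-cut), and the links critical to a non-cut. All function names are illustrative.

```python
def residual_links(routing, S):
    """Logical links whose physical route uses none of the fibers in S."""
    S = set(S)
    return [link for link, path in routing.items() if not (set(path) & S)]


def components(nodes, edges):
    """Connected components of an undirected graph, as a list of sets."""
    adj = {v: set() for v in nodes}
    for s, t in edges:
        adj[s].add(t)
        adj[t].add(s)
    comps, seen = [], set()
    for v in nodes:
        if v in seen:
            continue
        comp, stack = {v}, [v]
        while stack:
            for w in adj[stack.pop()]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        comps.append(comp)
    return comps


def cut_order(nodes, routing, S):
    """k such that S is a k-way cross-layer cut (k == 1 means non-cut)."""
    return len(components(nodes, residual_links(routing, S)))


def critical_links(nodes, routing, T):
    """Logical links that are cut edges of the residual graph of non-cut T."""
    rest = residual_links(routing, T)
    assert len(components(nodes, rest)) == 1, "T must be a non-cut"
    return [e for e in rest
            if len(components(nodes, [x for x in rest if x != e])) > 1]
```

On a logical triangle with one fiber per link, failing one fiber leaves a two-link path, so both surviving links are critical to that non-cut.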

The following theorems describe the conditions for a lightpath
rerouting that results in conversions between cross-layer
cuts and non-cuts. The proofs can be found in [33].

Theorem 4: Let S be a cross-layer cut for a lightpath
routing. Rerouting logical link (s, t) from physical path P1
to P2 turns S into a non-cut if and only if the following
conditions are true:

1) S is a 2-way cross-layer cut. 

2) s and t are disconnected in the residual graph for S. 

3) P2 does not use any physical links in S. 

Proof: Suppose all the above conditions are true. Since 
the new route P2 does not use any physical links in S, the 
logical link (s, t) will be in the logical residual graph for S 
under the new lightpath routing. Other logical links that are in 
the original residual graph will remain, because none of their 
physical routes have changed. Therefore, the residual graph 
will become connected now that (s,t) is added to it, which 
implies S becomes a non-cut. It can be easily verified that the 
residual graph will remain disconnected if any of the above 
conditions do not hold. ■ 
Theorem 5: Let T be a cross-layer non-cut for a lightpath
routing. Rerouting logical link (s,t) from physical path P1 to
P2 turns T into a cross-layer cut if and only if the following
conditions are true:

1) (s,t) is critical to T. 

2) P2 uses some physical link in T. 

Proof: Suppose both conditions are true. Since P2 uses 
some physical fiber in T, the logical link will be removed from 
the residual graph for T under the new lightpath routing. Since 
(s,t) is critical to the non-cut T, its removal will disconnect 
the residual graph, which means that T will become a cross- 
layer cut. It can be easily verified that the residual graph will 
remain connected if any of the two conditions do not hold. ■ 
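Both characterizations can be checked mechanically on small instances. The sketch below encodes the conditions of Theorems 4 and 5 as predicates and provides a direct recomputation (`is_cut_after`) to compare against. The dict-based routing representation and the function names are assumptions for illustration only.

```python
def n_components(nodes, edges):
    """Number of connected components, via a small union-find."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for s, t in edges:
        parent[find(s)] = find(t)
    return len({find(v) for v in nodes})


def residual(routing, S):
    """Logical links that use none of the physical links in S."""
    return [l for l, path in routing.items() if not (set(path) & set(S))]


def cut_becomes_noncut(nodes, routing, S, link, new_path):
    """Conditions 1)-3) of Theorem 4 for rerouting `link` onto new_path."""
    rest = residual(routing, S)
    two_way = n_components(nodes, rest) == 2                      # condition 1
    # Adding (s, t) merges two components iff s and t are disconnected.
    separated = n_components(nodes, rest + [link]) < n_components(nodes, rest)
    avoids = not (set(new_path) & set(S))                         # condition 3
    return two_way and separated and avoids


def noncut_becomes_cut(nodes, routing, T, link, new_path):
    """Conditions 1)-2) of Theorem 5 for rerouting `link` onto new_path."""
    rest = residual(routing, T)
    critical = (link in rest and
                n_components(nodes, [l for l in rest if l != link]) > 1)
    uses = bool(set(new_path) & set(T))
    return critical and uses


def is_cut_after(nodes, routing, S, link, new_path):
    """Direct check: is S a cross-layer cut once `link` uses new_path?"""
    rerouted = dict(routing)
    rerouted[link] = set(new_path)
    return n_components(nodes, residual(rerouted, S)) > 1
```

Comparing the predicates with `is_cut_after` over every fiber subset of a toy network is a cheap sanity check before trusting them inside a larger search.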

Therefore, the optimal rerouting should maximize the number
of cross-layer cuts satisfying Theorem 4 and minimize
the number of non-cuts satisfying Theorem 5. However, it is
also important to ensure that none of the non-cuts with size
smaller than d is converted to a cross-layer cut by the rerouting,
since otherwise the MCLC value will decrease. The following
theorem states that only non-cuts with size at least d - 1 can be
converted into a cross-layer cut by rerouting a single lightpath.

Theorem 6: Let d be the Min Cross Layer Cut value of a
lightpath routing and let $\mathcal{NC}$ be the set of cross-layer
non-cuts that can be converted into cross-layer cuts by rerouting a
single logical link. Then $|T| \ge d - 1$ for all $T \in \mathcal{NC}$.

Proof: Suppose $\mathcal{NC}$ contains a convertible non-cut S with
size less than d - 1. Since S is convertible, there exists a logical
link (s, t) that is critical to S. Now let $l$ be a fiber used by
(s,t); then the fiber set $S \cup \{l\}$ would disconnect the logical
residual graph and is therefore a cross-layer cut. However,
such a set contains at most d - 1 fibers, contradicting that d
is the Min Cross Layer Cut. ■

Therefore, when rerouting a lightpath, we need to make sure 
that none of the non-cuts with size d — 1 get converted into 
cuts in order to prevent the MCLC value from decreasing. 
Based on these observations, we next develop an ILP for the 
lightpath rerouting problem. 

2) ILP for Lightpath Rerouting: For the given lightpath
routing, let d be the MCLC value, and let $\mathcal{C}_d$, $\mathcal{NC}_d$ and
$\mathcal{NC}_{d-1}$ be the sets of 2-way cross-layer cuts with size d,
non-cuts with size d, and non-cuts with size d - 1, respectively.
The lightpath rerouting problem can be formulated as an ILP that
finds the logical link, and its new physical route, that maximizes
the net reduction in MCLCs. In other words, the optimal reroute
should result in the minimum number of cross-layer cuts
with size d, without creating any cross-layer cuts with size
d - 1.

The ILP can be considered as a path selection problem
on an auxiliary graph $G'_P = (V'_P, E'_P)$, where $V'_P =
V_P \cup \{u, v\}$, with u and v being the additional source
and sink nodes in the auxiliary graph; and $E'_P = E_P \cup
\{(u,x), (x,v) : x \in V_P\}$. Figure 5 illustrates the construction
of the auxiliary graph.




Fig. 5. Construction of the auxiliary graph for the ILP. u and v are the 
additional source and sink nodes, and the dashed lines are the additional 
links in the auxiliary graph. 
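The construction is small enough to state in a few lines. The following is only a sketch: the node labels "u" and "v" and the list-based graph representation are assumptions. A path from u to v in the result has the form u, s, ..., t, v, so it selects both the end points (s, t) and a physical route between them.

```python
def build_auxiliary_graph(physical_nodes, physical_links, u="u", v="v"):
    """Return (nodes, arcs) of the auxiliary graph used by the ILP.

    Physical links are kept as given; u gets an arc to every physical
    node and every physical node gets an arc to v.
    """
    if u in physical_nodes or v in physical_nodes:
        raise ValueError("u and v must be new node names")
    nodes = list(physical_nodes) + [u, v]
    arcs = list(physical_links)
    arcs += [(u, x) for x in physical_nodes]
    arcs += [(x, v) for x in physical_nodes]
    return nodes, arcs
```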

We first define the following variables and parameters:

1) Variables:

• $\{g_{st} : (s,t) \in E_L\}$: 1 if logical link (s,t) is
rerouted, and 0 otherwise.

• $\{f_{ij} : (i,j) \in E'_P\}$: Flow variables describing a
path in $G'_P$ from node u to node v.

• $\{y_c : c \in \mathcal{C}_d\}$: 1 if the cross-layer cut c is converted
into a non-cut by the lightpath rerouting, and 0
otherwise.

• $\{z_c : c \in \mathcal{NC}_d\}$: 1 if the non-cut c is converted into
a cross-layer cut by the lightpath rerouting, and 0
otherwise.

2) Parameters:

• $\{h^c_{st} : c \in \mathcal{C}_d, (s,t) \in E_L\}$: 1 if logical nodes s and
t are disconnected by the 2-way cut c, and 0
otherwise.

• $\{q^c_{st} : c \in \mathcal{NC}_d \cup \mathcal{NC}_{d-1}, (s,t) \in E_L\}$: 1 if logical
link (s, t) is critical to the non-cut c, and 0 otherwise.

• $\{l^c_{ij} : c \in \mathcal{C}_d \cup \mathcal{NC}_d \cup \mathcal{NC}_{d-1}, (i,j) \in E_P\}$: 1 if
physical link (i, j) is in set c, and 0 otherwise.

The lightpath rerouting can be formulated as follows:

REROUTE: Maximize $\sum_{c \in \mathcal{C}_d} y_c - \sum_{c \in \mathcal{NC}_d} z_c$, subject to:

$g_{st} \le (f_{us} + f_{tv})/2, \quad \forall (s,t) \in E_L$   (2)

$\sum_{(s,t) \in E_L} g_{st} = 1$   (3)

$l^c_{ij} f_{ij} + \sum_{(s,t) \in E_L} q^c_{st} g_{st} \le 1, \quad \forall c \in \mathcal{NC}_{d-1}, \forall (i,j) \in E_P$   (4)

$l^c_{ij} f_{ij} + \sum_{(s,t) \in E_L} q^c_{st} g_{st} \le z_c + 1, \quad \forall c \in \mathcal{NC}_d, \forall (i,j) \in E_P$   (5)

$y_c \le \sum_{(s,t) \in E_L} h^c_{st} g_{st}, \quad \forall c \in \mathcal{C}_d$   (6)

$y_c \le 1 - l^c_{ij} f_{ij}, \quad \forall (i,j) \in E_P, \forall c \in \mathcal{C}_d$   (7)

$\{(i,j) : f_{ij} = 1\}$ forms a (u,v)-path in $G'_P$   (8)

$f_{ij}, g_{st} \in \{0,1\}, \quad 0 \le y_c, z_c \le 1$

The formulation can be interpreted as a path selection
problem on the auxiliary graph $G'_P$. Constraint (8), which
requires that the variables $f_{ij}$ describe a path from u to v, can
be expressed by the standard flow conservation constraints. As
a result, in a feasible solution to the formulation, the variables
$f_{ij}$ represent a path $u \to s \leadsto t \to v$, which corresponds
to the new physical route for the logical link (s,t) after the
rerouting.

Constraint (2) ensures that $g_{st}$ can be set to 1 only if $f_{ij}$
represents the path $u \to s \leadsto t \to v$, and Constraint (3)
makes sure that the chosen (s,t) is indeed a logical link in
$E_L$. Therefore, exactly one logical link (s, t) can have $g_{st} = 1$,
and a feasible solution to this ILP corresponds to a rerouting
of the logical link.

In Constraint (4), the two terms correspond to the conditions
in Theorem 5. The constraint makes sure that at most one
of the conditions is satisfied, thereby preventing any non-cut
of size d - 1 from being converted into a cross-layer cut.
Similarly, Constraint (5) makes sure $z_c = 1$ for any non-cut
$c \in \mathcal{NC}_d$ that is converted into a cut by the rerouting.

Finally, Constraints (6) and (7) describe conditions 2) and
3) of Theorem 4, respectively. Therefore, $y_c$ can be 1 only if
both conditions are satisfied. Since c also satisfies condition
1) by definition of $\mathcal{C}_d$, this implies that cross-layer cut c is
converted into a non-cut when $y_c = 1$.

Since the objective is to maximize y c and minimize z c , in 
an optimal solution y c = 1 if and only if cross-layer cut c is 
converted into a non-cut, and z c = 1 if and only if non-cut c 
is converted into a cross-layer cut. As a result, the objective 
function reflects the net reduction in the number of MCLCs. 

Finally, note that the variables y c and z c will take on binary 
values in an optimal solution even if they are not constrained 
to be integral. This observation helps to reduce the number of 
binary variables in the formulation. 

The ILP REROUTE approximates the lexicographical ordering
minimization by minimizing the number of MCLCs
in the network. It can be extended to consider cross-layer
cuts of size larger than d, thus achieving a better
approximation. In this case, the sets of cross-layer cuts and
non-cuts $\mathcal{C}_d$ and $\mathcal{NC}_d$ will be replaced by sets that include
the cuts and non-cuts up to size $k > d$, denoted as $\mathcal{C}_{\le k}$ and
$\mathcal{NC}_{\le k}$. The objective function will be changed to

Maximize $\sum_{c \in \mathcal{C}_{\le k}} w_c y_c - \sum_{c \in \mathcal{NC}_{\le k}} w_c z_c$   (9)

where $w_c$ is a weight constant assigned to each cut c so
that a smaller cut will have weight that dominates cuts of
larger size. In particular, if k is set to $|E_P|$, the extended
ILP will return the optimal solution that minimizes the
lexicographical ordering. However, such a formulation will
contain an exponential number of variables $y_c$ and $z_c$,
and is generally not feasible for practical use. Therefore,
in the rest of the paper, we will focus on the problem of
minimizing the number of MCLCs, though the techniques
discussed in this paper are also applicable to the more
general setting.
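The text does not specify the weights $w_c$. One sufficient choice, offered here purely as an illustrative assumption, uses the fact that at most $B = 2^{|E_P|}$ fiber sets exist, so the net change contributed by all sets of any one size is bounded by $B$ in absolute value. Geometric weights with ratio $B + 1$ then make a unit change at size $j$ outweigh every possible change at larger sizes:

```latex
% Assumed weight choice (not taken from the paper):
\[
  w_c = (B+1)^{\,k-|c|}, \qquad B = 2^{|E_P|}.
\]
% Dominance check for any size j < k:
\[
  \sum_{i=j+1}^{k} B\,(B+1)^{\,k-i}
  \;=\; B \cdot \frac{(B+1)^{\,k-j} - 1}{(B+1) - 1}
  \;=\; (B+1)^{\,k-j} - 1
  \;<\; (B+1)^{\,k-j}.
\]
```

Weights of this magnitude are numerically impractical for all but tiny instances, which is consistent with the remark above that the extended formulation is not meant for practical use.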

3) Approximation Algorithm for Lightpath Rerouting: For 
larger networks, however, solving the rerouting ILP may still 
be infeasible. Therefore, in this section, we present an ap- 
proximation algorithm for the rerouting problem that provides 
near-optimal solutions within a much shorter time. 

We focus on the following question: Given the lightpath 
routing, and a logical link (s, t), what is the best way to reroute 
(s, t) assuming the routes for all other logical links are fixed? 
A solution to this problem will allow us to solve the lightpath 
rerouting problem, since we can run the algorithm once for 
each logical link, and return the best solution. 

Similar to the previous section, let $\mathcal{C}_d$, $\mathcal{NC}_d$ and $\mathcal{NC}_{d-1}$
be the set of cross-layer cuts of size d, non-cuts of size
d and non-cuts of size d - 1, respectively. Now suppose Q
is a new physical route for logical link (s,t). Let $\mathcal{NC}_d^{st}$
and $\mathcal{NC}_{d-1}^{st}$ be the subsets of $\mathcal{NC}_d$ and $\mathcal{NC}_{d-1}$ that satisfy
condition (1) of Theorem 5. These two sets represent the
non-cuts that can potentially be converted into a cut by rerouting
(s,t). It immediately follows that any (s,t) path that uses
a physical link in $\bigcup_{T \in \mathcal{NC}_{d-1}^{st}} T$ will create a cross-layer cut
with size d - 1, which should be forbidden for the new
physical route. In addition, for any physical link (i,j), the
set $\mathcal{L}_{ij}^{NC} = \{T \in \mathcal{NC}_d^{st} : (i,j) \in T\}$ represents the non-cuts
with size d that will be converted into cross-layer cuts if the
new route Q contains the physical link (i, j).

Similarly, let $\mathcal{C}_d^{st} \subseteq \mathcal{C}_d$ be the set of cross-layer cuts
that do not satisfy conditions (1) or (2) of Theorem 4.
This represents the set that will continue to be cross-layer
cuts regardless of the new physical route Q for (s,t). In
addition, for each $(i,j) \in E_P$, the cross-layer cuts in the
set $\mathcal{L}_{ij}^{C} = \{S \in \mathcal{C}_d - \mathcal{C}_d^{st} : (i,j) \in S\}$ will also continue to be
cross-layer cuts if the new route Q contains the physical link
(i,j), as they do not satisfy condition 3) of Theorem 4.

Now, for each physical link (i,j), let $\mathcal{L}_{ij} = \mathcal{L}_{ij}^{C} \cup \mathcal{L}_{ij}^{NC}$.
If a physical link (i,j) is used by the logical link (s,t) in
the new route Q, it will cause the sets in $\mathcal{L}_{ij} \cup \mathcal{C}_d^{st}$ to become
cross-layer cuts. Since every set of physical links in $\mathcal{C}_d^{st}$ will
be a cross-layer cut regardless of the physical route taken by
(s,t), the lightpath rerouting problem for logical link (s, t)
can be formulated as choosing the (s,t)-path Q in $G'_P =
(V_P, E_P - \bigcup_{T \in \mathcal{NC}_{d-1}^{st}} T)$ that minimizes
$|\mathcal{L}(Q)| = |\bigcup_{(i,j) \in Q} \mathcal{L}_{ij}|$. Although this is an instance of the
NP-Hard Minimum Color Path [34] problem, a simple
d-approximation algorithm exists, as described below:



Algorithm 1 REROUTE_SP(s, t)

1: Construct a weighted graph on $G'_P = (V_P, E_P -
\bigcup_{T \in \mathcal{NC}_{d-1}^{st}} T)$, where each edge (i,j) is assigned the
weight $w(i,j) = |\mathcal{L}_{ij}|$.

2: Run Dijkstra's algorithm to find the shortest (s,t)-path in
the weighted graph.
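A compact Python sketch of Algorithm 1 follows. It assumes the sets $\mathcal{L}_{ij}$ have already been computed and are passed in as a dict `L` keyed by physical link, and that the links belonging to critical non-cuts of size d - 1 are passed in as `forbidden`. These names and the undirected-link representation are assumptions for illustration.

```python
import heapq


def reroute_sp(links, s, t, L, forbidden=frozenset()):
    """Shortest (s, t)-path where link (i, j) costs |L[(i, j)]|.

    links      undirected physical links, as (i, j) tuples
    L          maps a link to the collection of size-d sets it keeps or
               turns into cross-layer cuts (missing links cost 0)
    forbidden  links removed from the graph before the search
    Returns (cost, path as a list of links), or None if t is unreachable.
    """
    adj = {}
    for i, j in links:
        if (i, j) in forbidden:
            continue
        w = len(L.get((i, j), ()))
        adj.setdefault(i, []).append((j, w, (i, j)))
        adj.setdefault(j, []).append((i, w, (i, j)))

    dist, prev, done = {s: 0}, {}, set()
    heap = [(0, 0, s)]            # (distance, tie-breaker, node)
    counter = 1
    while heap:
        d_u, _, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == t:
            break
        for v, w, link in adj.get(u, []):
            if v not in done and d_u + w < dist.get(v, float("inf")):
                dist[v] = d_u + w
                prev[v] = (u, link)
                heapq.heappush(heap, (dist[v], counter, v))
                counter += 1
    if t not in dist:
        return None
    path, node = [], t
    while node != s:              # walk predecessors back to s
        node, link = prev[node]
        path.append(link)
    return dist[t], path[::-1]
```

Note that the path cost sums $|\mathcal{L}_{ij}|$ over links, so a set that contains several links of the path is counted more than once. The cost is therefore an upper bound on $|\mathcal{L}(Q)|$ rather than the quantity itself.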



Theorem 7: Let $Q^*$ be the optimal physical route for (s,t)
that results in the minimum number of MCLCs, and let $Q^{SP}$
be the new route for (s,t) returned by REROUTE_SP. For
any (s,t)-path Q, let $N_d(Q)$ be the number of cross-layer
cuts with size d after rerouting (s,t) with Q, where d is the
size of the MCLC. Then $N_d(Q^{SP}) \le d \cdot N_d(Q^*)$.

Proof: See Appendix B. ■

Therefore, the number of cross-layer cuts of size d given by
REROUTE_SP is at most d times that of the optimal reroute. Note
that if the optimal new route for (s, t) eliminates every MCLC
of size d, i.e., $N_d(Q^*) = 0$, the approximation algorithm
will find a new route that achieves that as well. We state this
observation as the following corollary.

Corollary 4: REROUTE_SP(s, t) will return a new route 
for (s, t) that increases the size of MCLC of the layered 
network, if such a new route exists. 

We can extend algorithm REROUTE_SP, which is based
on Dijkstra's shortest path algorithm, by using the
k-shortest path algorithm [35] to successively compute the next
shortest path in $G'_P$ and keep track of the path Q with the
minimum value of $|\mathcal{L}(Q)|$. The value k reflects a tradeoff
between running time and quality of the solution. As we will
see in Section V, by picking a good value of k, we can obtain
a lightpath routing within a much shorter time than solving
the ILP without sacrificing much in solution quality.
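The idea can be sketched as follows. Note the substitution: instead of the k-shortest path algorithm of [35], this sketch uses a plain best-first enumeration of simple paths, which also yields paths in nondecreasing weight order when weights are nonnegative, but can take exponential time. It is only meant to show how the true objective $|\mathcal{L}(Q)|$ is tracked across candidates; all names are assumptions.

```python
import heapq
from itertools import count


def best_of_k_paths(links, s, t, L, k):
    """Among the k cheapest simple (s, t)-paths (by summed |L[link]|),
    return (|L(Q)|, path) for the one with the smallest union size."""
    adj = {}
    for i, j in links:
        w = len(L.get((i, j), ()))
        adj.setdefault(i, []).append((j, w, (i, j)))
        adj.setdefault(j, []).append((i, w, (i, j)))

    tie = count()
    # Heap entries: (cost, tie-breaker, node, visited nodes, path so far).
    heap = [(0, next(tie), s, (s,), ())]
    best, found = None, 0
    while heap and found < k:
        cost, _, u, visited, path = heapq.heappop(heap)
        if u == t:                          # next-cheapest complete path
            found += 1
            union = set()
            for link in path:
                union |= set(L.get(link, ()))
            if best is None or len(union) < best[0]:
                best = (len(union), list(path))
            continue
        for v, w, link in adj.get(u, []):
            if v not in visited:            # keep paths simple
                heapq.heappush(heap, (cost + w, next(tie), v,
                                      visited + (v,), path + (link,)))
    return best
```

With k = 1 this reduces to the shortest-path choice. Larger k can recover a route whose links share the same sets, which the additive weight overcounts.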

A Note on Complexity: The sets $\mathcal{C}_d$ and $\mathcal{NC}_d$ can be
constructed by enumerating all the $\binom{|E_P|}{d}$ subsets of physical
links, and each of them can be classified as a cut or non-cut
in $O(|E_L|)$ time by running a breadth-first search on the
logical topology. Similarly, for each subset $S \in \mathcal{C}_d \cup \mathcal{NC}_d$,
we can decide whether each of its members (i, j) is in $\mathcal{L}_{ij}$
and $\mathcal{NC}_{d-1}$ by breadth-first search. Therefore, the time to
compute all $\mathcal{L}_{ij}$ is $O(\binom{|E_P|}{d}(|E_L| + d)) = O(|E_P|^d |E_L|)$.
Overall, the time complexity to construct the graph $G'_P$ is
$O(|E_P|^d |E_L|)$. The k-shortest path algorithm on $G'_P$ can
be run in $O(k|V_P|(|E_P| + |V_P| \log |V_P|))$ time [35].
Therefore, the overall time complexity of REROUTE_SP(s, t) is
$O(|E_P|^d |E_L| + k|V_P|(|E_P| + |V_P| \log |V_P|))$.

B. Logical Topology Augmentation 

The Logical Topology Augmentation problem involves find- 
ing the best way to augment the logical topology with a single 
logical link, in order to maximize the reliability improvement. 
Even though the augmentation problem has been extensively
studied for single-layer networks [36]-[40], it has not been
studied before in the context of multi-layer networks. In
addition to the placement aspect of finding the end points for 
the new link as for the single-layer networks, there is also 
the routing aspect for the layered networks. This adds a new 
dimension of complexity to the augmentation problem. 

As it turns out, the insights from our study of the lightpath 
rerouting problem are largely applicable to the logical topol- 
ogy augmentation problem. In the following sections, we will 
first discuss the similarity between the augmentation problem 
and the lightpath rerouting problem, and then develop a similar 
ILP formulation and approximation algorithm. 

1) Effects of a Single-Link Augmentation: Similar to the
rerouting problem, the new logical link chosen by the aug- 
mentation algorithm should maximize the reduction in the 
number of MCLCs. However, unlike rerouting, adding a new 
link never converts a non-cut into a cross-layer cut. Therefore, 
in augmentation we only need to consider the effect of the new 
logical link on the existing cross-layer cuts. 

Suppose that an initial lightpath routing is given for the
physical topology $G_P = (V_P, E_P)$ and the logical topology
$G_L = (V_L, E_L)$. Let d be the size of the MCLC under the
initial routing. Let $G_L^S$ be the logical residual graph for any
cross-layer cut S, that is, the logical subgraph in which the
logical links do not use any physical links in S. The following
theorem characterizes the effect of a single-link augmentation.
The proof can be found in [33].

Theorem 8: Let S be a cross-layer cut for a lightpath 
routing. Augmenting the network with a new logical link (s, t) 
over physical route P converts a cross-layer cut S into a non- 
cut if and only if: 

1) S is a 2-way cross-layer cut. 

2) s and t are disconnected in the residual graph for S. 

3) P does not use any physical links in S. 

Proof: The proof is the same as that of Theorem 4. The new
logical link will make the residual graph connected if and
only if the above conditions are true. ■

Note that the characterizations for augmentation (Theorem 8)
and rerouting (Theorems 4 and 5) differ only in
that the conditions in Theorem 5 are no longer applicable
to augmentation, because augmentation never converts any
non-cut into a cross-layer cut. Therefore, we can revise the
ILP REROUTE accordingly to formulate an ILP for the
augmentation problem.

2) ILP for Logical Topology Augmentation: We will revise
the ILP REROUTE presented in Section IV to develop the ILP
for the augmentation problem. In REROUTE, the variables
$\{z_c : c \in \mathcal{NC}_d\}$ with $z_c = 1$ correspond to the set of non-cuts that
will be converted into cuts by the rerouting, and Constraints
(4) and (5) describe the conditions for such conversion. As
previously discussed, such conversion is not applicable in
augmentation and therefore these variables and constraints
can be removed from the ILP. In addition, unlike rerouting
where we choose from the set of existing logical links, in
augmentation we can pick any two logical nodes for the
new logical link. Therefore, we will replace the variable set
$\{g_{st} : (s,t) \in E_L\}$ in REROUTE by $\{g_{st} : (s,t) \in V_L \times V_L\}$
and remove Constraint (3). This gives us the following ILP
for augmentation:

AUGMENT: Maximize $\sum_{c \in \mathcal{C}_d} y_c$, subject to:

$g_{st} \le (f_{us} + f_{tv})/2, \quad \forall (s,t) \in V_L \times V_L$   (10)

$y_c \le \sum_{(s,t) \in V_L \times V_L} h^c_{st} g_{st}, \quad \forall c \in \mathcal{C}_d$   (11)

$y_c \le 1 - l^c_{ij} f_{ij}, \quad \forall (i,j) \in E_P, \forall c \in \mathcal{C}_d$   (12)

$\{(i,j) : f_{ij} = 1\}$ forms a (u,v)-path in $G'_P$   (13)

$f_{ij}, g_{st} \in \{0,1\}, \quad 0 \le y_c \le 1$

Similar to the interpretation of REROUTE, in a feasible
solution to AUGMENT, the variables $f_{ij}$ represent a path
$u \to s \leadsto t \to v$, as described by Constraint (13). This
corresponds to the new logical link to be added, along with
its physical route. Constraint (10) ensures that $g_{st} = 1$ if and
only if (s,t) is the new logical link selected. Constraints (11)
and (12) describe the conditions in Theorem 8. In particular,
the variable $y_c$ describes whether the cross-layer cut c is
converted into a non-cut by the augmentation. Therefore, the ILP
maximizes the number of such conversions, which translates
to maximizing the improvement in reliability.

3) Approximation Algorithm For Logical Topology Augmentation:
One can also design an approximation algorithm
similar to REROUTE_SP, introduced in Section IV-A3, for
the logical topology augmentation problem. We will again
focus on the following question: Given a layered network
and a new logical link (s,t), find the physical route for (s,t)
such that the resulting number of cross-layer cuts of size d is
minimized. We can then apply the algorithm for this problem
to every possible pair of logical nodes s and t, to find the
new logical link that would result in the maximum reliability
improvement.

Let d be the size of the MCLC of the layered network
and $\mathcal{C}_d^{st}$ be the set of 2-way cross-layer cuts of size d that
separate the logical nodes s and t. Then by Theorem 8, the
set $\mathcal{L}_{ij} = \{S \in \mathcal{C}_d^{st} : (i,j) \in S\}$ represents the sets in $\mathcal{C}_d^{st}$
that will remain cross-layer cuts if the physical link
(i,j) is used by the (s,t) path Q. We can then develop an
approximation algorithm for the augmentation problem similar
to REROUTE_SP:



Algorithm 2 AUGMENT_SP(s, t)

1: Construct a weighted graph on $G_P = (V_P, E_P)$, where
each edge is assigned the weight $w(i,j) = |\mathcal{L}_{ij}|$.

2: Run Dijkstra's algorithm to find the shortest (s, t)-path in
the weighted graph.



Since each cross-layer cut S in $\mathcal{C}_d^{st}$ has size d, there are
exactly d physical links (i,j) such that $S \in \mathcal{L}_{ij}$. As a result,
AUGMENT_SP is a d-approximation algorithm, with the same
proof as Theorem 7.
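A self-contained sketch of AUGMENT_SP is given below. It assumes that the size-d cross-layer cuts are supplied as frozensets of physical links, that a routing is a dict from logical link to the set of physical links it uses, and that logical nodes are also physical nodes. These representation choices and the function names are illustrative assumptions.

```python
import heapq


def components_of(nodes, edges):
    """Map each node to a component label (union-find)."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    return {v: find(v) for v in nodes}


def augment_sp(logical_nodes, routing, physical_links, cuts_d, s, t):
    """Route a new logical link (s, t); return (weight, path) or None."""
    # weight[(i, j)] = number of 2-way size-d cuts that separate s and t
    # and contain link (i, j), i.e. the size of the set L_ij.
    weight = {link: 0 for link in physical_links}
    for S in cuts_d:
        rest = [l for l, p in routing.items() if not (set(p) & S)]
        comp = components_of(logical_nodes, rest)
        if len(set(comp.values())) == 2 and comp[s] != comp[t]:
            for link in S:
                weight[link] += 1

    # Dijkstra on the physical topology with the weights above.
    adj = {}
    for i, j in physical_links:
        adj.setdefault(i, []).append((j, (i, j)))
        adj.setdefault(j, []).append((i, (i, j)))
    heap, settled, tie = [(0, 0, s, ())], {}, 1
    while heap:
        cost, _, u, path = heapq.heappop(heap)
        if u in settled:
            continue
        settled[u] = (cost, list(path))
        if u == t:
            return settled[u]
        for v, link in adj.get(u, []):
            if v not in settled:
                heapq.heappush(heap, (cost + weight[link], tie, v, path + (link,)))
                tie += 1
    return None
```

Running the function for every pair of logical nodes and keeping the pair with the fewest remaining size-d cuts gives the single-link augmentation step described above.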

V. Simulation Results 

The single-link rerouting and augmentation methods developed
in the previous section can be used as building blocks for
improving the reliability of an existing layered network. For
example, by iteratively rerouting the logical links for a given
lightpath routing until no further improvement is possible,
we can obtain a locally optimal solution. In this section,
we study the effectiveness of such an approach via extensive
simulation studies.

A. Iterative Rerouting for Survivable Lightpath Routing 

We first apply iterative rerouting to solve the Survivable
Lightpath Routing problem, whose objective is to obtain
a lightpath routing that maximizes the reliability for given
physical and logical topologies. The Survivable Lightpath
Routing problem has been previously studied in the literature,
where the best known algorithm [32] is based on an ILP formulation
that maximizes the MCLC of the network. In contrast, the
objective for the lightpath rerouting algorithm is based on the
lexicographical ordering of the cut vector, which captures more
precisely the survivability characteristics of the network. As
the results will show, using this improved objective
significantly improves the quality of the solution.

In this study, we use the NSFNET (Figure 6), extended
with new links to raise its connectivity to 4, as the physical
topology. For logical topologies, we generate 350 random
graphs with connectivity 4, ranging from 6 to 12 nodes and 13
to 38 links. For each algorithm under evaluation, we compute
a lightpath routing for each pair of physical and logical
topologies. The average reliability among the 350 lightpath
routings will be presented as the performance metric.
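For small instances, the reliability of a single lightpath routing can be computed exactly by enumerating fiber failure patterns. The sketch below assumes that fibers fail independently with the same probability p. That independence assumption, along with the data layout and names, is made here for illustration. The function returns the unreliability, i.e. one minus the reliability.

```python
from itertools import combinations


def connected(nodes, edges):
    """Depth-first search connectivity test on an undirected graph."""
    adj = {v: [] for v in nodes}
    for s, t in edges:
        adj[s].append(t)
        adj[t].append(s)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(nodes)


def unreliability(nodes, routing, fibers, p):
    """Exact P(logical topology disconnected); exponential in len(fibers)."""
    nodes = list(nodes)
    m = len(fibers)
    total = 0.0
    for size in range(m + 1):
        n_cuts = 0                      # number of cross-layer cuts of this size
        for failed in combinations(fibers, size):
            failed = set(failed)
            alive = [l for l, path in routing.items()
                     if not (set(path) & failed)]
            if not connected(nodes, alive):
                n_cuts += 1
        total += n_cuts * p ** size * (1 - p) ** (m - size)
    return total
```

For a logical triangle with one fiber per link, the cut counts are 3 for size 2 and 1 for size 3, so the unreliability is 3p^2(1 - p) + p^3.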




Fig. 6. The extended NSFNET. The dashed lines are the new links. 

We will first study the effect of the different initial lightpath
routings on the reliability of the final solution. Next, we will
compare the performance of the rerouting algorithm variants
based on the ILP and the approximation algorithm. Throughout
these studies, we also compare the solutions generated by
these algorithms with the solution generated by the best known
lightpath routing algorithm in the literature, MCF_LF [32]
(denoted as MCF in this paper for simplicity), as well as the
simple shortest path algorithm SP.

1) Performance of ILP-Based Rerouting: We first evaluate
the reliability performance of the ILP-based lightpath rerouting
approach introduced in Section IV-A2 with initial lightpath
routings generated by two different algorithms:

• RRsp: The initial lightpath routing is generated by the
Shortest-Path algorithm SP, which routes each lightpath
with the minimum number of physical hops.

• RRmcf: The initial lightpath routing is generated by the
algorithm MCF introduced in [32].

Compared with SP, MCF provides initial lightpath routings
with much higher reliability at the expense of longer running
time. Given the initial lightpath routing, the rerouting algorithm
repeatedly solves the rerouting ILP in Section IV-A2 to
improve the reliability, until it reaches a local optimum.

Figure 7 illustrates the average unreliability of the different
algorithms. Even with initial lightpath routings generated by
the best known lightpath routing algorithm MCF, the rerouting
algorithm RRmcf is able to further reduce the unreliability
of the lightpath routings. In fact, while only 50% of the
lightpath routings generated by MCF have an MCLC of 4, which
is the connectivity of the logical topologies and is therefore
the highest MCLC value achievable, the rerouting algorithm
RRmcf is able to achieve this maximum MCLC value 98% of
the time. This means that the lightpath rerouting approach is
able to produce lightpath routings that are much more reliable
than those of existing algorithms.

In addition, even though the initial lightpath routings gen- 
erated by SP and MCF differ significantly in reliability, the 
iterative rerouting eliminates most of the difference. This sug- 
gests that the rerouting approach is robust to initial routings, 
and we can use a simple algorithm, such as Shortest-Path, 
to generate the initial lightpath routing and rely on iterative 
rerouting to obtain a reliable solution. 




Fig. 7. Unreliability performance by different algorithms. [Plot: horizontal
axis is the link failure probability (p), from 0.01 to 0.04.]



Table I shows the average physical path length of
the lightpaths generated by the different algorithms. The
higher reliability of the rerouting algorithms comes at
the cost of longer paths, as the algorithms often select
longer physical routes in order to achieve higher
reliability. This reflects the tradeoff between the reliability
and the bandwidth resources used by the lightpath routings.



Number of              Average Path Length
Logical Nodes      SP      MCF     RRsp    RRmcf
      6           1.93    2.32    2.60    2.57
      7           1.90    2.28    2.58    2.57
      8           1.89    2.30    2.68    2.62
      9           1.88    2.29    2.57    2.60
     10           1.91    2.34    2.68    2.64
     11           1.90    2.32    2.60    2.64
     12           1.86    2.22    2.51    2.51

TABLE I

Average path length of the Shortest-Path algorithm SP, the
lightpath routing algorithm MCF, as well as the rerouting
algorithms using Shortest-Path (RRsp) and MCF (RRmcf) to
generate the initial lightpath routings.



Table II shows the average running times of the rerouting
algorithms (not including the time to generate the initial
routings), as well as the average number of rerouting iterations.
Compared with the lightpath routing algorithm MCF, the
rerouting algorithms are able to terminate faster with a better
solution. This is because the iterative rerouting approach
effectively decomposes the joint lightpath routing problem into
simpler single-link rerouting steps, where the ILP in each step
is much smaller than the lightpath routing formulation in MCF.

Between the two rerouting variants, RRsp requires more
iterations than RRmcf to reach the local optimum, because
it starts with a much less reliable initial lightpath routing.
However, the difference in total running time is less significant.
This is because the size of the rerouting ILP formulation is
larger when the MCLC of the lightpath routing is large, and
thus takes longer to solve. In most cases, RRsp starts with
an initial lightpath routing with a lower MCLC value. As a
result, most of the additional rerouting steps consist of solving
the smaller ILPs to bring up the MCLC value. Therefore, these
additional steps take much less time.



Number of         Running Time (seconds)     Number of Iterations
Logical Nodes     MCF     RRsp    RRmcf        RRsp     RRmcf
      6          1652      265     164          7.0      3.0
      7          1655      314     257          8.9      4.2
      8          1732      500     365         10.3      5.0
      9          1838      745     525         11.6      6.2
     10          2032     1238     824         14.1      7.3
     11          2219     1389    1280         14.0      8.0
     12          2716     1268    1104         14.1      8.2

TABLE II

Running times of the lightpath routing algorithm MCF, as
well as the rerouting algorithms using Shortest-Path (RRsp)
and MCF (RRmcf) to generate the initial lightpath routings;
and the number of iterations of the rerouting algorithms.



2) Performance of Approximation Algorithm: Next, we
compare the performance of the approximation algorithm
introduced in Section IV-A3 with its ILP counterpart. As
discussed, the approximation algorithm is based on the
k-shortest-path algorithm, where the parameter k reflects a
tradeoff between running time and reliability performance. We
evaluate this algorithm, Shortest_k, with k = 1, 10 and 100.

We use the lightpath routings generated by the Shortest
Path algorithm as the initial routings. Figure 8 shows the
average unreliability of the lightpath routings produced by
the algorithms. While Shortest_1 brings in the majority of the
improvement, increasing the value of k is able to further
reduce the unreliability. In particular, when k = 100, the
approximation algorithm performs almost as well as solving
the rerouting ILP.

Table III compares the running time of the algorithms. As
shown in the table, the approximation algorithms are significantly
faster than the ILP-based algorithm. This suggests that
the approximation algorithm is a promising alternative
to the ILP for improving the reliability of large networks,
without the need to solve complex mathematical programs.



Fig. 8. Lightpath rerouting: performance of approximation algorithm. [Plot:
horizontal axis is the link failure probability (p), from 0.01 to 0.04.]



Number of                  Running Time (seconds)
Logical Nodes     RRsp   Shortest_1   Shortest_10   Shortest_100
      6            265       12           14             24
      7            314       20           26             43
      8            500       32           43             79
      9            745       45           55            123
     10           1238       68           91            199
     11           1389       83          104            254
     12           1268      113          135            344

TABLE III

Running times of the rerouting algorithms based on ILP (RRsp)
and k-shortest paths.



B. Effects of Logical Topology Augmentation 

Next, we study the effect of augmenting the logical topology
on the network reliability. We study a 10-node and a 14-node
logical ring on the extended NSFNET, as shown in Figure 9,
and incrementally augment the rings to study the reliability
improvement from the addition of new logical links.




Fig. 9. Logical rings on extended NSFNET: (a) 10-node logical ring; (b) 14-node logical ring.

The cross-layer reliability of the networks after each
augmentation step is shown in Figure 10. With link failure
probability p = 0.01, the unreliability declines as we add
more logical links to the rings. The key observation from these
figures is that the improvement in reliability is most prominent
when the augmentation increases the MCLC of the network.
This suggests that networks with a small number of MCLCs
have a greater potential for significant reliability improvement
through augmentation, because a small number of new logical
links is more likely to raise their MCLC values.

In the case where the additional link does not cause an
MCLC increase, the marginal reliability improvement decreases
with the current MCLC value. This means that augmentation
is more effective when the MCLC value is lower.

Fig. 10. Impact on reliability by augmenting logical rings: (a) 10-node logical ring; (b) 14-node logical ring.



C. Case Study: A Real-World IP-over-WDM Topology 

Finally, we evaluate the performance of the rerouting and
augmentation algorithms on a large layered network based on
a real-world IP-over-WDM network. The physical and logical
topologies, shown in Figure 11, are constructed based on the
network maps available from Qwest Communications [41].
Both the physical and logical topologies are extended with
new links so that the graphs have connectivity 4. The physical
topology has 39 nodes and 72 links, and the logical topology
has 20 nodes and 101 links.

The study on larger networks allows us to reevaluate the 
performance of the lightpath algorithms, both in terms of scal- 
ability and solution quality. In this study, we run the following 
lightpath routing algorithms and compare their solutions: 

1) MCF: The multi-commodity flow algorithm introduced
in [32]. As in Section V-A, the algorithm is evaluated
as the performance baseline.

2) REROUTE: The iterative lightpath rerouting algorithm,
based on the k-shortest-path algorithm presented in
Section IV-A3, where k is set to 5000 in our experiment.

3) AUGMENT_n: The logical topology augmentation algorithm,
based on the k-shortest-path algorithm presented
in Section IV-B3, where k is set to 5000 in our experiment.
The augmentation algorithm is run successively
to add n new edges on the lightpath routing given by
REROUTE, where n = 1, ..., 9.

Fig. 11. Physical and logical topologies: (a) WDM (physical) network; (b) IP/MPLS (logical) network. The numbers indicate the number of parallel logical links between the logical nodes.

The MCLC values and the number of MCLCs of the lightpath
routings generated by each algorithm are shown in Table IV.
These numbers are compared against the lower bound,
which is computed by counting the number of minimum-size
physical fiber sets whose removal will physically disconnect
some logical nodes. These sets of physical links are cross-layer
cuts regardless of the lightpath routing, and therefore
provide a lower bound on the number of MCLCs.
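
To make the two quantities in Table IV concrete, the following sketch computes the cut vector of a lightpath routing by exhaustive enumeration and reads off the MCLC value and the number of MCLCs. It is only an illustration for very small networks (the enumeration is exponential in the number of fibers), not the method used in our experiments; the names `logical_connected`, `cut_vector`, and `mclc`, and the representation of a routing as a dictionary from logical links to the sets of fibers they use, are assumptions of this example.

```python
from itertools import combinations


def logical_connected(nodes, routing, failed):
    """True if the logical topology stays connected when fibers in `failed` are down."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for (s, t), fibers in routing.items():
        if not (fibers & failed):       # a lightpath survives only if all its fibers survive
            parent[find(s)] = find(t)
    return len({find(v) for v in nodes}) == 1


def cut_vector(nodes, fibers, routing):
    """N[i] = number of cross-layer cuts of size i (brute force)."""
    N = [0] * (len(fibers) + 1)
    for i in range(len(fibers) + 1):
        for failed in combinations(fibers, i):
            if not logical_connected(nodes, routing, set(failed)):
                N[i] += 1
    return N


def mclc(N):
    """Return (MCLC value, number of MCLCs) from a cut vector, or None if there is no cut."""
    for size, count in enumerate(N):
        if count > 0:
            return size, count
    return None


# Toy example: a logical triangle on three fibers 0, 1, 2.
nodes = [1, 2, 3]
fibers = [0, 1, 2]
routing_a = {(1, 2): {0}, (2, 3): {1}, (1, 3): {2}}      # every lightpath on its own fiber
routing_b = {(1, 2): {0}, (2, 3): {1}, (1, 3): {0, 1}}   # link (1,3) shares fibers 0 and 1
N_a = cut_vector(nodes, fibers, routing_a)
N_b = cut_vector(nodes, fibers, routing_b)
```

In this toy example, routing A has MCLC value 2 while routing B has MCLC value 1, because sharing fibers lets a single fiber failure disconnect the logical triangle.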

It was observed in [32] that the survivability performance
of the multi-commodity flow formulation MCF declines as the
network size increases. In this case, the solution produced by
the algorithm only has MCLC value 2. On the other hand,
the rerouting algorithm REROUTE continues to produce a
lightpath routing with the maximum possible MCLC value
4. Augmenting the logical topology can further improve the
reliability of the layered network by reducing the number of
MCLCs, though the incremental effect declines as more logical
links are added to the network. The number of MCLCs hits
the lower bound when the logical topology is augmented with
9 additional logical links.

Figure 12 compares the algorithms in terms of the cross-layer
reliability in the low failure probability regime. As
suggested by Table IV, the iterative algorithms achieve
significantly higher reliability than the existing algorithm MCF (by
about 3 orders of magnitude). In particular, the majority of the
improvement is achieved by the lightpath rerouting algorithm
REROUTE. This is because the lightpath rerouting method
alone is able to achieve the maximum MCLC value. As we
observed in Section V-B, adding logical links is more effective
only if the new links can raise the MCLC of the network.
In other words, even without adding new logical links, we
can obtain a near optimal solution by improving the existing
lightpath routing via the iterative rerouting method.



Algorithm      MCLC    Number of MCLCs
MCF              2            5
REROUTE          4          216
AUGMENT_1        4           84
AUGMENT_3        4           34
AUGMENT_5        4           25
AUGMENT_9        4           20
Lower Bound      4           20

TABLE IV

MCLC values and MCLC counts of different lightpath
routings. The lightpath routing on a logical topology
augmented with n new logical links is denoted by AUGMENT_n.



[Plot omitted. Legend: MCF, REROUTE, AUGMENT_1, AUGMENT_3, AUGMENT_5, AUGMENT_9.]

Fig. 12. Unreliability of different lightpath routings.

VI. High Failure Probability Regime 

As discussed in Section I, natural disasters or physical
attacks can lead to widespread network link failures. While
such events may be extremely rare, certain networks that are
of critical importance to national security and our day-to-day
lives may need to be designed so that they can withstand such
rare events. Moreover, certain "specialized" networks, such as
those onboard an aircraft or a ship, may need to be designed to
withstand very high link failure probabilities that result from
a catastrophic failure event (e.g., well over 50% link failures)
[42]. In this section, we briefly discuss network design in this
high failure probability scenario.

In Section III-B, we showed that when p is small, it is
important to minimize the number of small cuts. Analogously,
for large p, large cuts are dominant, and hence, minimizing
the number of large cuts would result in maximum reliability.
In other words, the cut vector should be minimized for large
cuts for better reliability in the high failure probability regime.
Similar to the case of the low failure probability regime, we
define the following vector of partial sums:

$$\overleftarrow{N} = \left[\, \sum_{i=m-k}^{m} N_i \;\middle|\; k = 0, \ldots, m \right].$$

The vector $\overleftarrow{M}$ is defined similarly. Note that the $i$-th element
$\overleftarrow{N}_i$ is the total number of cross-layer cuts of size at least
$m - i$. We will use these vectors to develop the conditions
that incrementally include larger cuts and characterize the
probability regime where one lightpath routing is more reliable
than any other for large p.
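
The two families of partial sums can be computed directly from a cut vector, as in the following sketch. The function names `forward_partial_sums` and `reverse_partial_sums` and the sample cut vector are assumptions introduced for this example only.

```python
from itertools import accumulate


def forward_partial_sums(N):
    """k-th entry = N_0 + ... + N_k: the number of cuts of size at most k."""
    return list(accumulate(N))


def reverse_partial_sums(N):
    """k-th entry = N_{m-k} + ... + N_m: the number of cuts of size at least m - k."""
    return list(accumulate(reversed(N)))


# Hypothetical cut vector for a network with m = 3 fibers.
N = [0, 2, 3, 1]
low_regime_vector = forward_partial_sums(N)
high_regime_vector = reverse_partial_sums(N)
```

For this sample vector, the entry with index 1 of the reverse partial sums counts the cuts of size at least 2, namely 3 + 1 = 4.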

First, the k-colexicographical ordering (an analogy to the
k-lexicographical ordering in Section III-C) is defined as follows:

Definition 6: Lightpath routing 1 is said to be
k-colexicographically smaller than lightpath routing 2 if

$$k = \max\{\, j : \overleftarrow{N}_i \le \overleftarrow{M}_i,\ \forall i \ge c - j \,\} \quad \text{and} \quad k \ge 1,$$

where c is the position of the last element where the two cut
vectors differ.

In contrast to the k-lexicographical ordering, this colexicographical
ordering starts from the largest cuts, and incrementally
includes the smaller cuts. The following result is similar
to Theorem 3.



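Theorem 9: Given two vectors $N = [N_i \mid i = 0, \ldots, m]$
and $M = [M_i \mid i = 0, \ldots, m]$. For any $j$, let

$$\Delta_j = \sum_{i=m-j}^{m} (M_i - N_i) \quad \text{and} \quad \delta_j = \max_{0 \le i \le m-j-1} \frac{N_i - M_i}{\binom{m}{i}}.$$

If $N$ is k-colexicographically smaller than $M$, then

$$F_1(p) \le F_2(p) \quad \text{for } p \ge p_0' = \max\Big\{0.5,\ \min_{c \le j \le c+k-1} C_j\Big\},$$

where $c = \min\{i : N_{m-i} < M_{m-i}\}$, $C_j = 0.5$ if $j = m$, and
otherwise $C_j = 1 - B_j$, with $B_j$ the threshold of Theorem 3
evaluated with the $\Delta_j$ and $\delta_j$ defined above (see Appendix C).

Proof: See Appendix C. ■
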
The following corollary is analogous to Corollary 3 for the
high failure regime:

Corollary 5: If $\overleftarrow{N}_j \le \overleftarrow{M}_j$ for all $j = 0, \ldots, m$, then
routing 1 is at least as reliable as routing 2 for $p \ge 0.5$, i.e.,
$F_1(p) \le F_2(p)$ for $p \ge 0.5$.

Combining Corollaries 3 and 5 gives a condition for uniformly
optimal lightpath routing:

Corollary 6: If $\overrightarrow{N}_j \le \overrightarrow{M}_j$ and $\overleftarrow{N}_j \le \overleftarrow{M}_j$ for all
$j = 0, \ldots, m$, then lightpath routing 1 is uniformly optimal.
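
As a numerical sanity check of this condition, the sketch below evaluates the failure polynomial $F(p) = \sum_i N_i p^i (1-p)^{m-i}$ for two cut vectors and tests both partial-sum conditions. The function names `failure_probability` and `satisfies_both_conditions` and the two sample cut vectors are assumptions made for this example; the check illustrates the statement on one instance and does not replace the proof.

```python
from itertools import accumulate


def failure_probability(N, p):
    """Evaluate F(p) = sum_i N_i p^i (1 - p)^(m - i) for a cut vector N of length m + 1."""
    m = len(N) - 1
    return sum(n * p**i * (1 - p) ** (m - i) for i, n in enumerate(N))


def satisfies_both_conditions(N, M):
    """Check the forward and reverse partial-sum conditions elementwise."""
    forward = all(a <= b for a, b in zip(accumulate(N), accumulate(M)))
    reverse = all(a <= b for a, b in zip(accumulate(reversed(N)), accumulate(reversed(M))))
    return forward and reverse


# Hypothetical cut vectors of two routings on m = 3 fibers.
N = [0, 0, 3, 1]
M = [0, 2, 3, 1]
dominates = satisfies_both_conditions(N, M)
grid = [i / 20 for i in range(21)]
never_worse = all(failure_probability(N, p) <= failure_probability(M, p) + 1e-12 for p in grid)
```

Here the difference of the two failure polynomials is $2p(1-p)^2$, which is nonnegative on the whole interval, in agreement with the condition.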

Theorems 3 and 9 provide a single optimality regime
expression for lightpath routings that exhibit different degrees
of dominance. Note that the conditions of (co)lexicographical
ordering in Corollary 6 are satisfied by the uniform optimality
condition $N_i \le M_i,\ \forall i$, given earlier in the paper. Therefore,
this unified theorem allows for a broader class of uniformly
optimal lightpath routings.

More importantly, Theorem 9 can be used to derive practical
conditions for optimal lightpath routings in the high failure
probability regime. We begin with the following definitions:

Definition 7: Consider two lightpath routings 1 and 2.
Routing 1 is said to be more reliable than routing 2 in the high
failure probability regime if there exists a number $p_0 < 1$ such
that the reliability of routing 1 is higher than that of routing
2 for $p_0 \le p < 1$.

Definition 8: A cross-layer spanning tree is a minimal set
of fibers whose survival keeps the logical network connected.
Hence, if $T$ is a cross-layer spanning tree, then the survival
of just $T \setminus \{(i,j)\}$ renders the logical network disconnected
for any fiber $(i,j) \in T$.

Note that the cross-layer spanning tree is a generalization 
of the single-layer spanning tree. However, unlike a single- 
layer graph where all spanning trees have the same size, in a 
layered graph, spanning trees can have different sizes. Thus, 
we define a Min Cross Layer Spanning Tree (MCLST) as a 
spanning tree with minimum number of physical links. 

In the high failure probability regime, a large number of
failures is likely. Hence, the MCLST is an important parameter
in this regime, because logical networks with a small MCLST
may remain connected even if only a small number of physical
links survive. This intuition, together with Theorem 9, leads to
practical conditions for optimal routings in the high failure
probability regime.
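
For very small networks, the size and the number of MCLSTs can be found by exhaustive search, as in the following sketch. It is only meant to illustrate the definition (computing the MCLST size is hard in general); the names `connected_with` and `mclst` and the dictionary-based routing format are assumptions of this example.

```python
from itertools import combinations


def connected_with(nodes, routing, surviving):
    """True if the logical topology is connected when only fibers in `surviving` are up."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for (s, t), fibers in routing.items():
        if fibers <= surviving:         # a lightpath needs all of its fibers
            parent[find(s)] = find(t)
    return len({find(v) for v in nodes}) == 1


def mclst(nodes, fibers, routing):
    """Return (MCLST size, number of MCLSTs) by brute force, or None if never connected."""
    for size in range(len(fibers) + 1):
        count = sum(
            1
            for surviving in combinations(fibers, size)
            if connected_with(nodes, routing, set(surviving))
        )
        if count:
            return size, count
    return None


# Toy example: a logical triangle on three fibers 0, 1, 2.
nodes = [1, 2, 3]
fibers = [0, 1, 2]
routing_a = {(1, 2): {0}, (2, 3): {1}, (1, 3): {2}}
routing_b = {(1, 2): {0}, (2, 3): {1}, (1, 3): {0, 1}}
result_a = mclst(nodes, fibers, routing_a)
result_b = mclst(nodes, fibers, routing_b)
```

Both toy routings have MCLST size 2, but routing A has three MCLSTs while routing B has only one, so routing A keeps the logical triangle connected under more of the two-fiber survival patterns.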

Note that in the failure polynomial, $N_i \le \binom{m}{i}$. Let $m - c$ be
the size of the MCLST. Then, $c$ is the largest $i$ such that $N_i < \binom{m}{i}$,
and we have $N_i = \binom{m}{i},\ \forall i > c$, meaning that more than
$c$ failures would always disconnect the logical network. Let
$m - c_j$ be the size of the MCLST for routing $j$. It is obvious
that if $c_1 > c_2$, or $c_1 = c_2$ and $N_{c_1} < M_{c_2}$, then routing 1 is
k-colexicographically smaller than routing 2. This observation
leads to corollaries similar to the low regime case:

Corollary 7: If $c_1 > c_2$ (i.e., if routing 1 has a smaller
MCLST than routing 2), then routing 1 is more reliable than
routing 2 in the high failure probability regime.

Corollary 8: If $c_1 = c_2$ and $N_{c_1} < M_{c_2}$ (i.e., routings 1
and 2 have the same size of MCLST, but routing 1 has more
MCLSTs), then routing 1 is more reliable than routing 2 in
the high failure probability regime.

Therefore, for reliability maximization in the high failure
probability regime, it is desirable to find a lightpath routing
that minimizes the size of the MCLST and maximizes the number
of MCLSTs. This observation is similar to the single-layer
setting, where maximizing the number of spanning trees
maximizes the reliability for large p [8]. The major difference in the
multi-layer case is that, since spanning trees may have different
sizes, minimizing the size of the Min Cross-Layer Spanning
Tree becomes the primary objective. Moreover, computing the
size of the MCLST is NP-hard [23], and therefore, designing
a lightpath routing that minimizes the MCLST is likely to be
a difficult problem. We developed an ILP-based algorithm that
finds a lightpath routing with minimum-size MCLST; its
details can be found in Appendix D.

VII. Conclusion 

We studied the reliability maximization problem in layered
networks with random link failures. We introduced the notion
of lexicographical ordering for lightpath routings, and fully
identified optimization criteria for maximum reliability in the
low failure probability regime. In particular, we showed that
a lightpath routing with the maximum size of Min Cross
Layer Cut (MCLC) and the minimum number of MCLCs
is most reliable in the low failure probability regime. Based
on this insight, we developed a novel lightpath rerouting
approach to design reliable layered networks for the low
failure probability regime. By incrementally improving the
lightpath routing, this rerouting approach is able to achieve
a locally optimal solution. Our simulation results show that
the rerouting algorithms developed in this paper are able to
produce much more reliable lightpath routings than existing
algorithms (by about 3 orders of magnitude in a real
IP-over-WDM network), and are more scalable to large networks.
Using the optimization criteria, we also developed logical
topology augmentation algorithms that can further improve the
reliability of a given layered network.

We also showed that the high failure probability regime
requires different optimization criteria: a routing with the
minimum size of Min Cross Layer Spanning Tree (MCLST)
and the maximum number of MCLSTs maximizes reliability.
Our results in the high failure probability regime lay the
foundation for the design of networks facing increased concern
about large scale failures due to natural disasters or attacks.

Appendix A
Proof of Theorem 3

We first prove the following lemma.

Lemma 1: If vector $N$ is k-lexicographically smaller than
vector $M$, then for all $j \le d + k - 1$, where $d = \min\{i : N_i < M_i\}$,
and for $0 \le p \le 0.5$, we have:

$$\sum_{i=0}^{j} (M_i - N_i)\, p^i (1-p)^{m-i} \ge \Delta_j\, p^j (1-p)^{m-j}. \tag{14}$$

Proof: We prove, by induction on $j$, that (14) holds for
all $j \le d + k - 1$. Since $\Delta_0 = M_0 - N_0$, we have for $j = 0$:

$$\sum_{i=0}^{0} (M_i - N_i)\, p^i (1-p)^{m-i} = \Delta_0 (1-p)^m.$$

Therefore, (14) holds for $j = 0$. Now suppose (14) holds for
all $i \le j$ for some $j < d + k - 1$. Then, we have:

$$\begin{aligned}
\sum_{i=0}^{j+1} (M_i - N_i)\, p^i (1-p)^{m-i}
&= \sum_{i=0}^{j} (M_i - N_i)\, p^i (1-p)^{m-i} + (M_{j+1} - N_{j+1})\, p^{j+1} (1-p)^{m-(j+1)} \\
&\ge \Delta_j\, p^{j} (1-p)^{m-j} + (M_{j+1} - N_{j+1})\, p^{j+1} (1-p)^{m-(j+1)} \\
&\ge \Delta_j\, p^{j+1} (1-p)^{m-(j+1)} + (M_{j+1} - N_{j+1})\, p^{j+1} (1-p)^{m-(j+1)} \\
&= \Delta_{j+1}\, p^{j+1} (1-p)^{m-(j+1)},
\end{aligned}$$

where the first inequality is due to the induction hypothesis,
and the second inequality holds because $\Delta_j \ge 0$ and
$p/(1-p) \le 1$. Therefore, by induction, (14) is true for all
$j \le d + k - 1$. ■

Lemma 2: Given a fixed $k$, if $\Delta_i \ge 0$ for all $i \le d + k - 1$,
then for any $d \le j \le d + k - 1$:

$$\sum_{i=0}^{m} (M_i - N_i)\, p^i (1-p)^{m-i} \ge 0$$

for $0 \le p \le \min\{0.5, B_j\}$, where:

$$B_j = \begin{cases} 0.5, & \text{if } j = m \\[4pt] \left[ \dfrac{m}{j+1} + \dfrac{\delta_j \binom{m}{j+1}}{\Delta_j} \right]^{-1}, & \text{otherwise.} \end{cases}$$

Proof: First, note that by definition of $\delta_j$, for any $i > j$:

$$\delta_j \binom{m}{i} \ge N_i - M_i. \tag{15}$$

If $k = m - d + 1$, then Lemma 1 implies that, for $p \le 0.5$:

$$\sum_{i=0}^{m} (M_i - N_i)\, p^i (1-p)^{m-i} \ge \Delta_m\, p^m \ge 0.$$

Therefore, the lemma is true for $k = m - d + 1$. Now suppose
$k < m - d + 1$. If $\delta_j \le 0$ for some $d \le j \le d + k - 1$, this implies for
any $d + k \le l \le m$:

$$\Delta_l = \Delta_{d+k-1} + \sum_{i=d+k}^{l} (M_i - N_i) \ge \Delta_{d+k-1} - \delta_j \sum_{i=d+k}^{l} \binom{m}{i} \ge 0,$$

where the first inequality is due to (15). The second inequality
is due to the fact that $\delta_j \le 0$, and that $\Delta_{d+k-1} \ge 0$, since $N$
is k-lexicographically smaller than $M$. Therefore, in this case,
the vector $N$ is also $(m - d + 1)$-lexicographically smaller than
$M$, and the lemma is true as proved above. Therefore, in the
rest of the proof, we assume that $\delta_j > 0$.

Since $p \le 0.5$ and $\Delta_i \ge 0$ for all $i \le d + k - 1$, by Lemma 1
we have, for all $j \le d + k - 1$:

$$\sum_{i=0}^{j} (M_i - N_i)\, p^i (1-p)^{m-i} \ge \Delta_j\, p^j (1-p)^{m-j}. \tag{16}$$

Next, we will use the following result to bound the tail
probability of the Binomial distribution:

Lemma 3 ([?]): For $r > mp$,

$$\sum_{i=r}^{m} \binom{m}{i} p^i (1-p)^{m-i} \le \binom{m}{r} p^r (1-p)^{m-r}\, \frac{r(1-p)}{r - mp}.$$

Therefore, since $p \le B_j < \frac{j+1}{m}$, by Lemma 3 we have:

$$\begin{aligned}
\sum_{i=j+1}^{m} \binom{m}{i} p^i (1-p)^{m-i}
&\le \binom{m}{j+1} p^{j+1} (1-p)^{m-(j+1)}\, \frac{(j+1)(1-p)}{j+1-mp} && (17) \\
&= \binom{m}{j+1} p^{j} (1-p)^{m-j}\, \frac{(j+1)p}{j+1-mp}. && (18)
\end{aligned}$$

In addition, since $p \le B_j$, we have:

$$\frac{(j+1)p}{j+1-mp} \le \frac{\Delta_j}{\delta_j \binom{m}{j+1}}. \tag{19}$$

It follows that:

$$\begin{aligned}
\sum_{i=0}^{m} (M_i - N_i)\, p^i (1-p)^{m-i}
&= \sum_{i=0}^{j} (M_i - N_i)\, p^i (1-p)^{m-i} + \sum_{i=j+1}^{m} (M_i - N_i)\, p^i (1-p)^{m-i} \\
&\ge \sum_{i=0}^{j} (M_i - N_i)\, p^i (1-p)^{m-i} - \delta_j \sum_{i=j+1}^{m} \binom{m}{i} p^i (1-p)^{m-i} \\
&\ge \Delta_j\, p^j (1-p)^{m-j} - \delta_j \binom{m}{j+1} p^{j} (1-p)^{m-j}\, \frac{(j+1)p}{j+1-mp} \\
&= p^j (1-p)^{m-j} \left[ \Delta_j - \delta_j \binom{m}{j+1} \frac{(j+1)p}{j+1-mp} \right] \\
&\ge 0.
\end{aligned}$$

The first inequality is due to (15), the second inequality is
due to (16) and (18), and the last inequality is due to (19). ■

As a result of Lemma 2, we can pick the $d \le j \le d + k - 1$
such that $B_j$ is maximized to obtain the largest upper bound
for $p$, and Theorem 3 follows.

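Appendix B
Proof of Theorem 7

Given any $(s,t)$ path $Q$, define $\mathcal{C}(Q) = \bigcup_{(i,j) \in Q} \mathcal{C}_{ij}$. It
follows that $N_d(Q) = |\mathcal{C}(Q)| + |\mathcal{C}_f| = |\mathcal{C}(Q)| + K$, where
$K = |\mathcal{C}_f|$ is a constant. In addition, let $w(Q)$ be the total
weight of the path $Q$ in the weighted graph constructed
by REROUTE_SP(s,t).

Since each set of physical links $S \in \mathcal{C}(Q)$ has size $d$, we
have $|\{(i,j) : S \in \mathcal{C}_{ij}\}| \le d$, which implies:

$$w(Q) = \sum_{(i,j) \in Q} |\mathcal{C}_{ij}| \le \sum_{S \in \mathcal{C}(Q)} |\{(i,j) : S \in \mathcal{C}_{ij}\}| \le d \cdot |\mathcal{C}(Q)| = d \cdot (N_d(Q) - K). \tag{20}$$

Now, since $Q_{SP}$ is the minimum weight $(s,t)$ path in the
graph, it follows that:

$$\begin{aligned}
N_d(Q_{SP}) &= |\mathcal{C}(Q_{SP})| + K \\
&\le w(Q_{SP}) + K \\
&\le w(Q^*) + K \\
&\le d \cdot (N_d(Q^*) - K) + K, \quad \text{by Equation (20)} \\
&\le d \cdot N_d(Q^*).
\end{aligned}$$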



Appendix C
Proof of Theorem 9

Let $N'_i = N_{m-i}$ and $M'_i = M_{m-i}$, for $i = 0, \ldots, m$,
and let $\overrightarrow{N}'_k = \sum_{i=0}^{k} N'_i$ and $\overrightarrow{M}'_k = \sum_{i=0}^{k} M'_i$. It follows that
the vector $\overrightarrow{N}' = [\overrightarrow{N}'_i \mid i = 0, \ldots, m]$ is k-lexicographically
smaller than the vector $\overrightarrow{M}' = [\overrightarrow{M}'_i \mid i = 0, \ldots, m]$. Let
$q = 1 - p$. Then, by Theorem 3:

$$\sum_{i=0}^{m} (M_i - N_i)\, p^i (1-p)^{m-i} = \sum_{i=0}^{m} (M'_i - N'_i)\, q^i (1-q)^{m-i} \ge 0$$

for $q \le \min\Big\{0.5,\ \max_{d \le j \le d+k-1} B_j\Big\}$.

In the above expression, $B_j = 0.5$ if $j = m$, and otherwise
$B_j$ is the expression of Theorem 3 evaluated with

$$\Delta'_j = \sum_{i=0}^{j} (M'_i - N'_i) \quad \text{and} \quad \delta'_j = \max_{j+1 \le i \le m} \frac{N'_i - M'_i}{\binom{m}{i}}.$$

Note that $B_j = 1 - C_j$ for $d \le j \le d + k - 1$. Therefore,
routing 1 is at least as reliable as routing 2 for

$$p = 1 - q \ge 1 - \min\Big\{0.5,\ \max_{d \le j \le d+k-1} B_j\Big\} = \max\Big\{0.5,\ \min_{d \le j \le d+k-1} C_j\Big\}.$$

This completes the proof.

Appendix D

Lightpath Routing ILP to Minimize Minimum Cross
Layer Spanning Tree (MCLST) Size

As discussed in Section VI, lightpath routings with smaller
MCLST size will be more reliable in the high failure
probability regime. In this section, we present an ILP formulation
that finds a lightpath routing minimizing the MCLST size, and
is therefore optimized for the high failure probability regime.
We first define the following variables:

• $\{f^{st}_{ij} \mid (s,t) \in E_L,\ (i,j) \in E_P\}$: Flow variables representing
the lightpath routing.

• $\{y_{ij} \mid (i,j) \in E_P\}$: 1 if fiber $(i,j)$ survives, 0 otherwise.

• $\{z_{st} \mid (s,t) \in E_L\}$: 1 if lightpath $(s,t)$ survives, 0 otherwise.

• $\{x_{st} \mid (s,t) \in E_L\}$: Flow variables on the logical topology.

MCLST: Minimize $\sum_{(i,j) \in E_P} y_{ij}$, subject to:

$$\sum_{t : (s,t) \in E_L} x_{st} - \sum_{t : (t,s) \in E_L} x_{ts} = \begin{cases} |V_L| - 1, & \text{if } s = 0 \\ -1, & \text{if } s \in V_L - \{0\} \end{cases} \tag{21}$$

$$(|V_L| - 1) \cdot z_{st} \ge x_{st}, \quad \forall (s,t) \in E_L \tag{22}$$

$$y_{ij} \ge z_{st} + f^{st}_{ij} - 1, \quad \forall (s,t) \in E_L,\ \forall (i,j) \in E_P \tag{23}$$

$$\{(i,j) : f^{st}_{ij} = 1\} \text{ forms an } (s,t)\text{-path in } G_P, \quad \forall (s,t) \in E_L$$

$$0 \le y_{ij} \le 1, \quad 0 \le x_{st}; \quad z_{st}, f^{st}_{ij} \in \{0, 1\}$$

The variables $x_{st}$ represent a flow on the logical topology
where 1 unit of flow is sent from logical node 0 to every other
logical node, as described by Constraint (21). Constraint (22)
requires these flows to be carried only on the surviving
logical links, which implies that the surviving links form
a connected logical subgraph. Constraint (23) ensures the
survival of physical links that are used by any surviving logical
links. Since the objective function minimizes $\sum_{(i,j) \in E_P} y_{ij}$, the
optimal solution will represent a minimum set of physical links
whose survival will allow the logical network to be connected.

Therefore, the set of physical links with $y_{ij} = 1$ forms
a cross-layer spanning tree. As a result, the optimal solution
to the above ILP yields a lightpath routing that minimizes the
size of the MCLST.